Abstract


This course will survey computer graphics techniques for, and problems in,
producing illustrations for technical publications. The lectures will reference
published material but will gather unpublished research and techniques into a useful 
set of course notes. The survey begins with the two-dimensional device-independent 
imaging model that forms the basis for recent page description languages, Xerox 
Interpress and Adobe Systems PostScript. Then the sources of illustrations and their 
organization into documents are covered from the point of view of an implementor. 
The focus shifts in the afternoon to the problems of rendering documentation graphics 
with the quality, style, and typography expected in technical publications. The final 
topic addresses automated techniques for creating graphical presentations. 
Throughout the course, unsolved problems and thorny technical issues will be 
highlighted to point out areas needing further work. 
Preface


This course on documentation graphics provides an opportunity to concentrate on the 
issues of incorporating illustrations within documents. There are lots of tools to create
documents, lots of tools to create illustrations, and lots of activity to integrate the two. What 
problems remain? The major problems are the lack of quality in the combined result and the 
lack of integration between the illustration and textual worlds. 


These course notes present a series of essays and republished papers on topics concerning 
the production of quality illustrations using documentation graphics tools. The course begins 
by identifying several graphic arts quality issues. One of these quality issues is typography, 
and we address the philosophy of marking and the design considerations in creating a new 
digital typeface, Lucida. 


The foundation for documentation graphics is an imaging model suitable to the goal of 
meeting graphic arts quality standards. This imaging model is device-independent and 
encompasses the presentation of information on two-dimensional media. 


Page description languages capture the rendering of documents and illustrations. These 
notes outline the history of their evolution and compare the two major languages: PostScript
and Interpress. The procedural aspect of these languages is crucial to supporting the range of 
imaging requirements. Interpress extends the page rendering notion to document production 
control for distributed multi-purpose documents. Rendering is not a sufficient solution. 
Capturing the content of illustrations and documents for revision requires representation and 
interchange standards for content. 


Creating illustrations is a major focus of documentation graphics. These notes survey 
electronic sources of images to provide a framework for integrating these sources. The notion 
of idiomatic illustrators is presented in unpublished work that underlies an integrated 
graphics environment. Papers on implementation techniques and color reproduction 
problems for illustrator programs extend the discussion of documentation graphics systems. 


Controlling the appearance of illustrations is introduced in the notion of graphical style. 
Automating the presentation of information through graphic illustration is the topic of 
current research using artificial intelligence planning and rule-based techniques. The notes 
end with a summary of the traditional document production sequence where documentation
graphics will fit. 
Quality Issues for Documentation Graphics


Richard J. Beach 
Xerox Palo Alto Research Center 


The Graphic Arts Standards of Quality 


We wish to consider, and strive to attain, the standards of quality commonplace among 
practitioners and artifacts in the graphic arts community. Technical journals and textbooks
are frequent examples of these standards. Our documentation graphics course uses the 
graphic arts quality standards as a theme throughout the presentations. 


The quality issue is mainly a subjective one, constrained by economic and technological 
factors. While we may wish to present colored information with the quality of National 
Geographic, the resolution of common laser printers coupled with their ubiquitous lack of 
color dictates that compromises must be considered! Nonetheless, identifying the quality
measures and highlighting the quality issues will provide a rational means of assessing our 
success in meeting the aesthetic standards we encounter in publications that we enjoy using 
and reading. 


Components of Quality in Documentation Graphics 


The Jaggies 


No Jaggies! The classic SIGGRAPH t-shirt design superimposing a staircase and the 
international NO design (red circle and slash) eloquently speaks to improving the quality of 
computer-generated graphics. The jaggies introduce false messages, especially easy-to-notice
corners or staircases in supposedly smooth and straight lines and edges. 


There are two major solution strategies: increased resolution and smoothing techniques. 
Increasing resolution involves changing devices, which argues both for device-independent 
representation of images and for a systems approach to permit access to a variety of devices. 
With relatively coarse-resolution display screens, anti-aliasing techniques to eliminate the
jaggies have received considerable attention in computer graphics research. With laser
printers that are capable only of binary-valued image pixels and possess only moderate resolutions,
the difficulties are considerable. The Warnock and Wyatt paper on Device Independent 
Graphics Imaging Model is fundamental to the page description languages that address this 
issue. 


High quality printing increases the expectation level for typographic (or ‘typeset’) 
quality. Reproducing typefaces is a crucial quality issue. Chuck Bigelow’s following paper
on the Design of Lucida discusses how that typeface design attacks the quality issue. 


Font Choices and Typography 


A common standard of graphic arts quality is the use of typographic fonts. The choice of 
type families is often subjective and driven by faddish concerns. Three type family names 
have become quite ubiquitous in documentation graphics: Helvetica, Times Roman, and 
Computer Modern. However, the quality of reproduction is sometimes so poor that the 
resemblance of the type to those families is in name only!


Type fonts stress the reproduction quality of systems because they contain so much visual 
information in such a small space. The rhythm of the strokes, the swell of the serifs, the
contrast between thicks and thins, and the ‘color’ of the resulting page are all subjective
measures with which we subconsciously develop experience during our lifetime of reading. 
Attempting to match these typographic expectations on moderate-resolution devices is a 
significant challenge. 


Line Weights 


A fascinating distinction between low-quality images and high-quality ones is often the
variation of line weights. Uniform weights are a signal that little effort was made to 
distinguish the importance of information. (Of course, uniformity of line weight here does 
not refer to the poor drawing capabilities of pen plotters with nearly exhausted pens.) 
Specifying thin lines for dimension lines, axes of graphs, or highlight lines is a common
method for de-emphasizing them. Using thick lines for important objects or outlining the
shape or curve with a thick line helps direct the viewer's attention. 


Varying the line weight introduces difficulties in properly drawing lines that intersect,
and in finishing the ends and joints of line segments. The page description languages 
discussed later both provide extensive controls for specifying these attributes. 
Contrast and Dynamic Range


The contrast and dynamic range of an image indicate the quality of the reproduction
process. The blackness of the blacks, the whiteness of the whites, and the distribution of the 
grays in between are an issue here. Copy-quality problems with some output devices reduce 
the achievable contrast. High-resolution devices that output to high-contrast photographic 
media are designed to avoid these problems. Such devices are typically graphic arts image 
setters or process cameras.


Digital images that contain scanned or sampled data must be produced with concern for 
the tone values that result from halftoning or dithering algorithms. Compensation techniques 
to ensure a linear response are necessary to avoid compressing all the information into the 
white or black region of the tone reproduction curve. 
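
One way to picture such compensation is a lookup table that inverts a measured tone response. The following is only a minimal sketch: the function name `build_compensation_lut` and the sample response values are hypothetical, invented for illustration rather than taken from any device or from these notes, and the inversion by piecewise-linear interpolation is an assumed technique.

```python
from bisect import bisect_left

def build_compensation_lut(measured, levels=256):
    """Invert a measured tone response by piecewise-linear interpolation.

    measured[i] is the printed darkness (0.0 to 1.0) observed when the device
    is asked for darkness i / (len(measured) - 1); it must be strictly
    increasing. The result maps each desired level to the level to request.
    """
    n = len(measured)
    lut = []
    for v in range(levels):
        target = v / (levels - 1)
        # Clamp to the range the device can actually reproduce.
        target = min(max(target, measured[0]), measured[-1])
        j = bisect_left(measured, target)
        if j == 0:
            request = 0.0
        else:
            lo, hi = measured[j - 1], measured[j]
            request = (j - 1 + (target - lo) / (hi - lo)) / (n - 1)
        lut.append(round(request * (levels - 1)))
    return lut

# Hypothetical response of a device that prints darker than requested.
measured = [0.0, 0.35, 0.60, 0.80, 0.93, 1.0]
lut = build_compensation_lut(measured)
```

With this invented response, mid-gray requests are mapped to lighter device values, so the tones are spread across the curve instead of crowding into the black region.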


Colors and their Reproduction 


The use of color introduces many significant challenges. The perceptual impact of color 
is well-known, but the control over that impact is not fully understood or practiced. It is 
impossible to explain the garish choice of fully-saturated primary and secondary colors on 
many colored samples produced by manufacturers of color devices. Well-coordinated use of 
color requires subtle control over the color imaging process. Faithful reproduction of color 
across different devices and media requires continued research as indicated by Maureen 
Stone’s paper Color, Graphic Design, and Computer Systems, later in these notes.


Visual Interest and Design Consistency 


Visually interesting images are a challenge. Creative people can produce interesting 
images in any medium, including those produced with computer-based tools. However, such
automated tools lend themselves to producing uninteresting images faster and with more 
regularity! The tools do not increase the creative content, although they may liberate creative 
people to attempt more creative images. 


Illustrations prepared for graphic arts quality publications are often edited and managed 
to achieve a certain appearance. This control and discipline must be considered in using 
documentation graphics tools. Harmonizing the choice of typefaces, the selection of line
weights, the variation in colors, and the range of interesting shapes is a necessary supervisory 
task. Graphic designers receive professional training in this discipline. When amateurs 
attempt similar illustrations, they should rely on the professionals for advice or use tools that 
incorporate the disciplines of the professional. 
Notes on Marking


CHARLES BIGELOW 


Stanford University 
Bigelow & Holmes 


The electronic document, like traditional kinds of documents, 
is based on writing. Writing is a graphical form of language, 
though its exact linguistic nature is subject to debate, even after 
5,000 years of literate history. Depending upon the philosoph- 
ical perspective of the analyst and on the way that writing is 
used in a given context, a written text can be considered either 
as a recording of speech, in which case it is a subsidiary level of 
linguistic representation, or as a direct expression of language, 
in which case it is an alternative to speech, but expressed visu- 
ally rather than aurally. 

The different modes of perception used by speaking and writ- 
ing make for important differences between speech and text. 
The acoustic stream of speech is temporal and linear, but the 
graphical image of text is spatial and planar. Writing has al- 
ways been durable, but speaking was ephemeral until the devel- 
opment of acoustic recording technology. Moreover, although 
reading is a sequential decoding of a text image, text elements 
can also be perceived simultaneously, whereas speech sounds 
are perceived sequentially. 

An analysis of the text image reveals some aspects that rep- 
resent various elements of speech and language, and other as- 
pects that appear to be artifacts of the graphic medium, though 
sometimes such distinctions are subtle. 

Electronic documents in their graphical expression use the 
form of writing called typography. A fundamental principle of
typography is the dual nature of the image: the complementary 
relationship between figure and ground (positive and negative 
space) at each level of organization. At the level of the letter,
there is the marked form, and the unmarked space inside it, 
traditionally called the counter. At the level of the word, there is
the grouping of letters, and the spaces between them, or letter- 
spaces. At the level of the line, there is the word space. At the 
level of the column, there is the space between lines, or leading. 
At the level of the page, there are the spaces separating and 
surrounding the columns, or gutters and margins. 

When a graphical and linguistic unit coincide, the ground 
space is significant; it marks some element of speech or lan- 
guage. For example, in modern English orthography, the word 
space differentiates two words. When the graphical and linguis- 
tic units do not coincide, the space still serves a purpose; it 
organizes the text in the plane of the page, for the reading pro- 
cess. For example, the spaces at the beginnings and endings of 
lines define a column as a coherent body of text, though they do 
not necessarily have any linguistic significance. Other uses of 
the ground may have significance at higher levels of text orga- 
nization, such as paragraph indentations, or the line endings, 
indentations, and line spacings of poetry. 

Beyond the page, the text of a document is organized by 
higher level linguistic groupings, such as chapters, by symme- 
try principles, such as repetition, pattern, homology, etc., and 
by conceptual structures, as seen in the indexed array of pages 
used in the codex form of books. 

The structure of a writing system usually emphasizes a par- 
ticular level of language, isolating certain elements for repre- 
sentation and ignoring others. Alphabetic writing represents 
phonemes. The Graeco-Roman alphabet represents both con- 
sonantal and vocalic phonemes, whereas the traditional He- 
brew and Arabic alphabets represent consonants only (though 
there is some provision for vowels). Logographic writing rep- 
resents words or morphemes. Chinese writing and Japanese 
Kanji (based on Chinese) are logographic. Syllabic systems and 
hybrid systems are also used. 

The nature of the signs used to represent these linguistic elements
can be iconic or symbolic, in the semiological terminology
developed by C. S. Peirce, or, in the semiological terminology
of De Saussure, motivated or unmotivated. Many
writing systems began with iconic or motivated signs which re- 
sembled some object, and which later evolved into symbolic 
or unmotivated signs that were abstract shapes. The histo- 
ries of cuneiform, hieroglyphics, the alphabet, and Chinese and 
Japanese writing exhibit this tendency, although to different
degrees. 

At the level of individual signs, the alphabetic system has 
become abstract and symbolic, but this has encouraged the 
evolution of an additional dimension of signification, that 
of typographic variation. Since the 15th century, the typo- 
graphic alphabet has become successively richer and more var- 
ied through a process of amalgamation in which stylistically 
different alphabetic forms are joined together in typeface fam- 
ilies. Formal oppositions like majuscules and minuscules (cap- 
itals and lower-case), or roman and italic, define dimensions of 
graphic contrast that are used to mark significant features of a 
text. 

The antiquity of the amalgamation determines the strength 
of the bond between the variations, probably because the more 
ancient formal distinctions have had more time and opportu- 
nities to become incorporated into the standard semantics of 
the text. CAPITALS and lower-case were first conjoined early in 
the 15th century, and today it is almost obligatory to include 
both in a standard character set like the ASCII set or a typewriter
font. Roman and italic types were first conjoined in the middle 
of the 16th century, and today italic is a necessary companion 
to roman in most literary texts. Normal and bold weights were 
first conjoined in the 19th century, and today boldface is a fre- 
quent supplement to italic in many typographic texts. Seriffed 
and sans-serif types began to be used together a few decades 
ago, and appear to be on the way to forming an extended family 
of typographic variation. 

In English orthography, there are reasonably well-defined 
rules for the use of capitals and lower-case, though these change 
with time. Capitalization in an 18th century book is often different
than it would be today. Most publishing houses and periodicals
have manuals of style that set forth rules for the standard
usage of italic and bold faces, though these rules may be less 
rigid than those for capitalization. 

Semantic interpretation of graphical variation appears to be 
more effective with abstract, symbolic systems than with con- 
crete, iconic systems. This can be seen in modern workstations 
as well as in historical scripts. In a workstation user-interface 
that supports both icons and text, the icons will generally be 
graphically invariant (other than by video-reverse), whereas the 
text will often support variations such as italic and bold face.

The graphical invariance of the icons in this context suggests 
that they are not the same kind of writing as alphabetic or logo- 
graphic systems. This is borne out by the non-syntactic nature 
of most screen icons - unlike letters or characters, they can- 
not be combined into larger and more complex expressions - 
and by their lack of standardization; the forms of icons vary 
from system to system. It may be that screen icons are an in- 
choate writing system still in the process of formation, or that 
screen iconography will remain a “pictographic” system (to use 
an older terminology) or a “semasiographic” system, to use a 
more recent terminology for signs that represent objects rather 
than elements of language. 

The different graphical aspects of the text image each deserve 
a thorough analysis. The basic principles of text organization 
are part of the passive graphic vocabulary of most literates, 
but an active, refined understanding of such principles was, in 
traditional literate societies, the province of the professional 
scribe, and today is the domain of the typographer and graphic 
designer. This understanding has in the main been intuitive, 
though in comparison to the other visual arts, typography is 
more susceptible to rational analysis. 

The following paper, “The Design of Lucida: an Integrated 
Family of Types for Electronic Literacy”, is an attempt to provide 
an analysis of the design decisions that determine the visual ap- 
pearance of a typeface. As such, it is focused on a relatively low 
level of the typographic image, the letterforms. Analyses of the 
visual principles of the higher levels of text organization will be 
necessary if the design and formatting of electronic documents 
is to become part of computer science as well as a part of the 
graphic arts. 


Copyright © 1986 by Charles Bigelow. 
[ Reprinted from Text Processing and Document Manipulation,
copyright © 1986, Cambridge University Press. ]


The Design of Lucida®: 
an Integrated Family of Types 
for Electronic Literacy 


CHARLES BIGELOW & KRIS HOLMES 


Stanford University 
Bigelow & Holmes 


ABSTRACT 

Electronic printing and publishing transform traditional analog letter- 
forms into digital pixel patterns. At medium and low resolutions, alias- 
ing degrades the legibility of digital types. To maintain typographic 
quality in an aliased image environment, the design of readable digital 
typefaces requires rational, technical considerations as well as intu- 
itive, artistic processes. The graphic functions of typefaces in electronic 
publishing also require rational structuring. Documents can be more ef- 
fectively designed when typographic variations of weight and style are 
integrated into a systematic design family. 


OMAGGIO A GRIFFO 

One by one, each seriffed black figure 

is transfixed in the luminous white field. 
As the gaze travels across the page, 

it goes like the wind on a summer night, 
Blowing the clouds in their atmosphere, 
Light in the face of the full moon, 

and dark in the depths of the starry sky. 


In designing the Lucida typefaces for laser printers, digital type- 
setters, and CRT screens, we found inspiration in the work of 
Francesco Griffo, a Renaissance typeface designer who flour- 
ished at the end of the 15th century during an era of profound 
change in the technology of literacy. 

At a time when other type-founders were attempting in vain 
to copy manuscript writing hands, Griffo was the first punch-cutter
to actively explore the design possibilities of the engraved,
typographic letter. He created letterforms that were no
longer handwriting, but that nevertheless stemmed naturally 
from principles inherent in the alphabet. The types he cut for 
the Venetian printer Aldus Manutius profoundly influenced the 
history of typography [Mardersteig69] [Morison73]. 

The fundamental problem that Griffo faced was how to maintain
clarity and vivacity of the text image in a radically changed
imaging technology. We face the same problem today. 


1. Transformations of Letterforms. 

1.1 From Ductal to Sculptal to Pictal. 

Griffo helped transform the ductal handwritten letter of the 
scribe into the sculptal typographic letter of the printer. The 
analog typographic letter became the publishing standard for 
five centuries, but it is now being replaced by the pictal elec- 
tronic letter in digital printing and electronic publishing. From 
the reader’s point of view, the main flaw of digital typography 
is the degraded appearance of the typefaces [Bigelow81, 82]. 


1.2 Aliasing. 

Current digital screens and printers lack resolution sufficient to 
render traditional analog letterforms adequately. A digital let- 
ter is typically produced by sampling an analog letter. Low and 
medium resolution devices like CRT screens and laser printers 
do not provide enough samples in a letter image to reproduce 
all the information in the original design. “Under-sampling” 
causes loss of high-frequency information and “aliasing”, a 
form of digital noise. The contours of an aliased letter are dis- 
rupted by “jaggies”; its proportions, weight, and spacing are dis- 
torted, and its fine details obscured [Bigelow,Day83]. 
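
The weight distortion described above can be illustrated with a small sketch. It assumes a simple point-sampling model, in which a pixel is marked when its center falls inside the stem; that model and the name `rasterized_stem_width` are assumptions made for illustration, not a description of any particular scan converter.

```python
import math

def rasterized_stem_width(left_edge, width):
    """Count the pixels whose centers fall inside a vertical stem.

    Pixel k covers [k, k + 1) and is sampled at its center, k + 0.5.
    The stem covers [left_edge, left_edge + width), in pixel units.
    """
    right_edge = left_edge + width
    first = math.ceil(left_edge - 0.5)  # first pixel with center >= left_edge
    end = math.ceil(right_edge - 0.5)   # first pixel with center >= right_edge
    return max(0, end - first)

# A stem designed to be 1.4 pixels wide, placed at ten different phases
# relative to the pixel grid.
ideal_width = 1.4
widths = [rasterized_stem_width(i / 10, ideal_width) for i in range(10)]
print(widths)
```

Under this model the same 1.4-pixel stem comes out one pixel wide at some positions and two pixels wide at others, so identical stems in a line of text can differ in weight by a factor of two.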

To understand how to design letters for rasterized reproduc- 
tion in the one bit per pixel technologies common today, it is 
helpful to consider the common ad hoc methods of ameliorat- 
ing aliasing in letterforms. These are bitmap editing and outline 
deformation. 


1.3 Bitmap Editing. 

A letter is first scan-converted to a raster from an analog image 
or a digital outline. Then a designer “edits” the resulting raster 
image by adding, deleting, or rearranging pixels. In this process, 
the designer’s intuitive, internalized model of what the letter 
image should be is mapped onto the actual raster image, mod-
ifying the arbitrary output of an electro-optical or algorithmic 
process. This is often effective because the designer actually 
sees what she is doing, and thus intuitively tunes the image to 
the characteristics of the human visual system in accordance 
with knowledge of canonical letter shapes. 

Bitmap editing has the disadvantage of being a low-level, lo- 
cal manipulation of the letter image. An alphabet design also 
contains global information that bitmap editing cannot directly 
address. The alphabet structure remains a concept in the mind 
of the designer, but it is not a part of the data representation. 


1.4 Outline Deformation. 

A higher-level method of ameliorating aliasing is to deform 
letter images globally throughout an alphabet before scan- 
conversion. When letters are represented as outlines, the points 
defining the outlines can be adjusted in relation to the output 
raster so that certain letter features will fall nicely on raster val- 
ues. For example, the edges of all vertical stems in an alphabet 
could be deformed to fall on integer values of the raster, and 
constrained to have the same pixel thickness [Karow83]. 
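
The stem example can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the method of [Karow83]: the stem coordinates, the scale factor, and the function names are hypothetical, and the shared thickness is taken as the rounded mean stem width.

```python
import math

def round_half_up(x):
    """Round to the nearest integer, with halves rounded upward."""
    return math.floor(x + 0.5)

def naive_stems(stems, scale):
    """Round each stem edge to the raster independently."""
    return [(round_half_up(l * scale), round_half_up(r * scale)) for l, r in stems]

def snapped_stems(stems, scale):
    """Snap each left edge to the raster and give every stem one shared thickness."""
    mean_width = sum(r - l for l, r in stems) / len(stems)
    thickness = max(1, round_half_up(mean_width * scale))
    snapped = []
    for l, _ in stems:
        left_px = round_half_up(l * scale)
        snapped.append((left_px, left_px + thickness))
    return snapped

stems = [(100, 180), (430, 510), (760, 842)]  # hypothetical stems, design units
scale = 0.02                                  # hypothetical pixels per design unit

print(naive_stems(stems, scale))    # independent rounding of each edge
print(snapped_stems(stems, scale))  # edges on the raster, uniform thickness
```

With these invented values, independent rounding renders the middle stem one pixel wide and its neighbors two pixels wide, while the snapped version keeps all three stems at two pixels.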


1.5 Noise & Signal. 
Bitmap editing and outline deformation can have similar re- 
sults, and it is possible that they are related in an abstract way. 

Bitmap editing appears to be a way of rearranging the aliasing
noise in the rasterized letter image. Because editing does not 
increase resolution, aliases remain from under-sampling, but 
their spatial positions in the image have been moved. 

In the frequency domain, bitmap editing may be a way to 
shift the frequencies of the aliasing noise further from the fun- 
damental frequencies of the signal. The noise would then mask 
perception of the signal to a lesser degree. The letters remain 
distorted, but appear less so because certain lower frequency 
components (e.g. stems) have a more regular relationship to 
the raster. 

These are merely conjectures, but efforts to optimize digi- 
tal type could benefit from a rigorous analysis of the effects 
of scanning, editing, and deformation on the frequency spec- 
tra of letterforms at various resolutions. Typefaces are a spe- 
cial kind of image that could benefit from refined methods of 
anti-aliasing, especially on displays with multiple bits per pixel
[Kajiya,Ullner81] [Dippe,Wold85].

Deformation of outlines appears to accomplish the same 
thing as bitmap editing, though prior to scan-conversion, by 
deforming the original image to match more closely the period- 
icity of the sampling grid. Major letter features become aligned 
with the raster, and thereby exhibit less obvious distortion. 

In both cases, important information about the structure of 
the alphabet is missing from the basic font data, whether raster 
or outline, and must be supplied from an external source (a de- 
signer). An explicit model of alphabetic structure could support 
automatic identification of letter features and their parameter- 
ized deformation to optimize scan-conversion of letterforms. 

It may be that such a model could take the form of a more 
elaborate data structure for each outline font, or perhaps canonical
models could be developed for the kinds of alphabet design:
either by style, i.e. seriffed, sans-serif, etc.; or by class, i.e. Latin, 
Greek, etc. Much of the research in this area is currently em- 
bedded in proprietary research or commercial systems [Plass] 
[Sheridan] [Warnock]. 


1.6 Rationalization. 

To facilitate digitization and enhance image quality, alphabet 
designs for electronic printing and publishing should be more 
explicit and more rationalized than traditional analog type- 
faces. A structural model of the alphabets should be communi- 
cable both to designers editing bitmap fonts, and to algorithms 
performing automatic transformations on digital outlines. 

An outline representation with precise specifications of pro- 
portions, parameters, letter parts, and other features of the de- 
sign is one way to implement alphabetic structure. Philippe 
Coueignoux has described one approach to a “syntactic” font 
description [Coueignoux75]. A different approach to a param- 
eterized, structured alphabet design, based on a “pen-drawing” 
model rather than outlines, is described by Donald Knuth 
[Knuth80].


1.7 Tuned Features. 

The features of letterforms intended for digital printing and 
display should in general be tuned to the marking characteristics
of digital devices. This is difficult because different devices
may have contradictory effects, and new kinds of technologies
are continually being developed.


1.8 Systematic Typography. 

Typeface design is not isolated from the literate culture that 
uses printing systems. Electronic document production has cer- 
tain typographic requirements that are different from those of 
traditional publishing; these will proliferate as the digital tech- 
nology becomes more prevalent. Among the present needs are 
simplicity and clarity in typeface families, so that authors and 
editors may achieve greater fluency in the symbolic language 
of typographic signs, without protracted study. Typographic 
variation should be coherent and systematic. 


2. The Design Concepts of Lucida 

Following these observations and conclusions, we designed the 
Lucida family of typefaces to provide, at the perceptual level, 
acceptable legibility in an aliased image environment, and, at 
the semiological level, a functional system of typographic vari- 
ations. 

Although types are designed at a large size (the master out- 
line characters of Lucida are digitized at an em square of 168 
x 168 mm), text is read at small sizes. In designing Lucida, we 
worked at several levels of the letter image. 


2.1 Form, Pattern, & Texture. 

At the large size of the master design, a letter form is comprised 
of sculpted contours delineating dark forms and light counter- 
forms. At a middle size of headlines, letters in combination 
make patterns out of the quasi-symmetries of repeated forms. 
At the small size of text, a complex texture emerges from the 
interaction of the letter features en masse. 

We design the features of a typeface at the level of forms, but 
the character of the face emerges at the level of texture. For the 
designers, there are often surprises when a type design is first 
proofed: rational decisions about formal properties turn out to
have irrational effects when the texture is perceived. This is 
part of the excitement of typeface design. 

In its features, Lucida is intended to be a font-independent
design. This is not the same as a device-independent font. A 
type design is a visual concept, whereas a font is an implementation
of that concept in software or hardware. Traditional typefaces
were tuned to the typefounding and printing processes.
We sought to tune the letterforms of Lucida to digital image 
processing and reconstruction. 


2.2 Weight. 

An index of the weight of a normally proportioned typeface is 
the ratio of the thickness of a straight stem to the height of the 
lower-case ‘x’. The shade of the gray texture of a face is termed 
“color”. Our survey of several traditional and popular text type- 
faces showed a variation of stem to x-height ratios from 5:1 to 
6:1. Types with ratios toward 5:1 are darker; those with ratios 
toward 6:1, lighter. 

In laser-printing, the polarity of the marking engine becomes 
an important factor. White-writing engines tend to erode the 
contours of the letterforms, lightening the color of the text. 
Black-writing engines tend to spread the contours, darkening 
the text. 

Another factor is the use of laser-printer output as masters 
for offset-lithography or photocopying. These processes fur- 
ther darken or lighten the text image. 

On screens, the writing spot which reconstructs the bitmap
letterforms also changes the weight of the text image. The perceived
weight of screen text is influenced by the intensity contour
of the spot and the size of the spot in relation to the resolution
of the raster. The reconstruction filter effects are strongest
at the pixels along the contour of a letterform. Small sizes and
lower resolutions, where the contour is a greater part of the total
image, are more strongly affected.

Numerically, the weight ratio of a face necessarily varies
from size to size because stems and x-heights are rounded off
to integer pixel values at each raster size. We examined the
amounts of error in ideal weight ratios caused by round-off at
common sizes and resolutions.
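
The round-off effect can be illustrated numerically. The following Python sketch is not the survey procedure used for Lucida; it assumes an x-height of 52% of the body, reads the weight index as x-height divided by stem thickness with an ideal value of 5.5, and assumes 72 points per inch. The function name and the sample sizes are hypothetical.

```python
def rounded_weight_ratio(point_size, dpi, x_fraction=0.52, ideal_ratio=5.5):
    """Round an ideal x-height and stem to whole pixels and report the result.

    x_fraction and ideal_ratio are assumed design values, and the conversion
    of 72 points per inch is an assumption of this sketch.
    Returns (stem_px, x_px, ratio, relative_error).
    """
    body_px = point_size * dpi / 72.0
    ideal_x = body_px * x_fraction        # ideal x-height in pixels
    ideal_stem = ideal_x / ideal_ratio    # ideal stem thickness in pixels
    x_px = max(1, round(ideal_x))         # a feature cannot vanish entirely
    stem_px = max(1, round(ideal_stem))
    ratio = x_px / stem_px
    error = (ratio - ideal_ratio) / ideal_ratio
    return stem_px, x_px, ratio, error


# Tabulate a few hypothetical size and resolution combinations.
for dpi in (75, 300):
    for size in (8, 10, 12):
        stem, x, ratio, err = rounded_weight_ratio(size, dpi)
        print(f"{size:2d} pt at {dpi:3d} dpi: stem {stem} px, "
              f"x-height {x} px, ratio {ratio:.2f}:1, error {err:+.0%}")
```

At low resolutions the stem is only one or two pixels thick, so a single pixel of rounding changes the ratio by a large fraction, which is the effect the paragraph above describes.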

These observations led us to estimate that the weight of the 
text image seen by the reader would on the average vary about 
10% from the original design, and in the worst cases by as much 
as 25%. To make the typeface resistant to extreme variations in 
color, we designed the normal weight of Lucida with a stem to 
x-height ratio of 5.5 to 1. 


SIGGRAPH '86 TUTORIAL COURSE NOTES 

THE DESIGN OF LUCIDA 21 


2.3 Contrast. 

Contrast is the ratio between the thick and thin parts of let- 
ters. Serifs, hairlines, and joins are thins; vertical stems, curved 
bowls, and main diagonals are thicks. The contrast of tradi- 
tional text types ranges from a high of 5:1 to a low of 2:1. 
The high-contrast faces appear delicate and brilliant; the low- 
contrast faces, sturdy and solid. 

High-contrast faces are believed to be more difficult to read
than medium and low-contrast designs. Moreover, thin hair- 
lines and serifs are more susceptible to breakage and erosion 
by printing processes. Text degraded by broken thins is espe- 
cially objectionable because the letterforms lose connectivity 
and become more difficult to discriminate. Marking effects that 
change weight change contrast even more, because erosion or 
expansion of thin hairlines is proportionally greater than for 
thick stems. 
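
A small calculation makes the last point concrete. The sketch below assumes, as a simplification, that a marking process adds or removes the same absolute amount from every stroke; the function name and the sample values are hypothetical.

```python
def after_marking(thick, thin, delta):
    """Apply the same absolute change `delta` to a thick and a thin stroke.

    A negative delta models erosion (white-writing) and a positive delta
    models spreading (black-writing). The uniform change is an assumption
    of this sketch. Returns (new_thick, new_thin, new_contrast).
    """
    new_thick = thick + delta
    new_thin = thin + delta
    if new_thin <= 0:
        raise ValueError("the thin stroke has been eroded away entirely")
    return new_thick, new_thin, new_thick / new_thin


# A 2:1 face and a 5:1 face, both eroded by 0.2 units per stroke.
for thick, thin in ((2.0, 1.0), (5.0, 1.0)):
    _, _, contrast = after_marking(thick, thin, -0.2)
    print(f"{thick:.0f}:{thin:.0f} becomes {contrast:.2f}:1 after erosion")
```

Under this assumption the thin stroke loses 20% of its thickness while the thick stroke of the 2:1 face loses only 10%, so the contrast rises even though both strokes were eroded equally.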

To prevent loss of hairlines and serifs on white-writing
engines and bitmap screens displaying black text on an illumi- 
nated background, we chose a low contrast of 2:1 for the basic 
Lucida seriffed designs. This decision in favor of robustness 
also influenced the design of joins and serifs. 


2.4 Joins. 

Black-writing printers and reverse-video displays increase the 
thickness of thin elements. In particular, the white triangular 
counter-forms produced where an arch joins a straight stem, as 
in an ‘n’, tend to be filled in when letter contours are embold- 
ened. Therefore, when joins are kept sturdy to prevent erosion 
by white-writing printers, counters are susceptible to clogging 
by black-writing printers. 

Our solution to this antinomy was to branch the joins rela- 
tively deep on the stems, so that the triangular counter-form of 
the master design has a generous area. Hence, even when the 
counter is filled to some degree, it remains open enough to be 
acceptable. After we had designed this feature in Lucida, we dis- 
covered that Fleischman, an 18th century punch-cutter, used a 
similar technique in cutting small sizes of types intended for 
journal publishing [Carter37] [Enschede08].

To further prevent clogging, we reduced the thickness of the 
stem close to the join by making the segment of the stem edge 
closest to the join cut into the stem at a slight angle. The amount
of cut is determined by the position of a single point. When the
join is in danger of clogging at small sizes, this point can be 
shifted toward the interior of the stem and the cut widened. 
When the stem should appear straight at large sizes, the cut can 
be narrowed. 


2.5 Serifs. 

Our design experiments showed that long, thick serifs give a 
typeface a stolid appearance and a dark color. We wanted thick 
serifs to resist erosion, but we didn’t want too dark a color. Ac- 
cordingly, we reduced the total area of the serifs by abbreviating 
their lengths to one-half of the stem thickness. 

Serif shapes also posed problems. When letters are reduced to 
coarse bitmaps at low and medium digital resolutions, brack- 
eted serifs are reduced to slab serifs. When letters are repre- 
sented as outlines, curved brackets are complex details that can 
require extra time to digitize, more space in storage, and more 
time to scan-convert.

A slab serif would have simplified the alphabet design with- 
out appreciable loss of elegance at low resolution, but at high 
resolution, the slabs would have seemed monotonous. We 
chose a middle path, chamfering the serif and stem with slight 
diagonal taperings. At low resolutions, these serifs can be 
rounded off to simple slabs, but as resolution increases, the
chamferings provide variations in weight and thickness that 
enliven the printed texture. 

These polygonal serifs can be compactly and precisely repre- 
sented by vectors. In a font format that provides for adjustment 
or deformation of letter features to enhance scan-conversion, 
the polygonal serifs are more diagrammatic than absolute, be- 
cause the points on the vertices can be moved by algorithm or 
by designer specification to enhance the appearance of the re- 
sultant bit image. 


2.6 X-height. 

The x-height (height of lower-case ‘x’) of a typeface is an index 
of the apparent size of a typeface. Most of the shape informa- 
tion in the lower-case alphabet is carried by those parts of the 
letters that lie between the baseline and the x-line. Typefaces 
with large x-heights look bigger than those with small x-heights, 
even when the actual body sizes (total cell height from bottom
of descender to top of ascender) are the same.

Low-resolution systems entice the designer toward large x- 
heights because the complex middle portions of the lower-case 
need more resolution than the relatively simple ascenders and 
descenders. However, if the ascenders and descenders are re- 
duced too far, the complex lower loop of the humanistic ‘g’ will 
be distorted, and the shapes of other letters (‘h’ - ‘n’, ‘b’ - ‘p’) 
will become indistinguishable from each other, destroying the 
legibility of the face. Thus, there is an upper bound to the size 
of the x-height. 

The x-height of Lucida is 52% of the body. This allows more 
detail to be devoted to the lower-case letter shapes, and permits 
Lucida to pack a relatively large amount of legible text informa- 
tion into a relatively small area. Lucida set at 9 point seems as 
large as many other faces at 10 or 11 point. Where page space is 
limited and text economy important, this increase in apparent 
size is a definite advantage. However, where economy of space is
not crucial, we prefer to see Lucida composed with extra points 
of “leading” (white space) between lines, to give the page a more 
open and relaxed texture. 
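
The apparent-size comparison can be worked through as simple arithmetic. In the sketch below the 52% figure comes from the text; the 45% x-height of the comparison face is a hypothetical value chosen only for illustration, and the function names are not from the paper.

```python
def x_height_points(body_points, x_fraction):
    """Height of the lower-case 'x' in points for a given body size."""
    return body_points * x_fraction


def matching_body_size(body_points, x_fraction, other_x_fraction):
    """Body size at which another face shows the same x-height."""
    return x_height_points(body_points, x_fraction) / other_x_fraction


lucida_x = x_height_points(9, 0.52)
other_size = matching_body_size(9, 0.52, 0.45)   # 0.45 is hypothetical
print(f"9 pt at 52% x-height: x is {lucida_x:.2f} pt tall")
print(f"a face with a 45% x-height needs {other_size:.1f} pt to match")
```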


2.7 Fitting.

The positive (black) and negative (white) shapes in a letterform 
are equally important. Traditional typefaces are fitted so that 
the spaces between letters are visually equivalent and harmo- 
nized with the white counters inside the letters. Aliasing dis- 
torts the interletter white spaces as much as the black shapes 
of the letters, causing an irregular texture with dark collisions 
of some characters and empty voids between others. 

In advertising typography, tightly kerned letter spacing 
draws attention to texts that are otherwise empty of content. 
However, when this kind of spacing is attempted on low reso- 
lution printing systems, round-off error of letter widths creates 
an objectionable, splotchy texture. 

The best printed books of the last 500 years have type- 
faces that are regularly, harmoniously, and often openly spaced 
[Tschichold66]. We followed these models when fitting the Lu- 
cida designs for laser printer resolutions. At typesetter resolu- 
tions, where tighter fitting can be accomplished without losing 
a regular rhythm, Lucida can be more closely spaced.


2.8 Capital Height.

Our traditional capital forms were developed by the Romans, 
and our lower-case (minuscules) by Carolingian scribes. Capi- 
tals and lower-case were separate alphabets until the early 15th 
century, when they were first amalgamated into a single duplex 
alphabet by the Florentine humanist and scribe, Poggio Bracci- 
olini. At the end of that century, Francesco Griffo fine-tuned 
the relationship between typographic capitals and lower-case 
by reducing the relative size of the capitals. 

Documents printed by laser printers often are dominated by 
capitals, usually for retrograde reasons left over from mono- 
case terminals and printers. Following Griffo’s lead, we made 
the Lucida capitals slightly shorter than the ascenders of the 
lower-case so that capitals would not be too emphatic and dis- 
tracting when used heavily in a text. As well as reducing their 
height, we also gave the capitals slightly narrow proportions 
to provide even greater space economy when capitals are used 
extensively in a document. 

We also observed that weight differences between capitals 
and lower-case are often exaggerated at low resolutions, when 
a one pixel increase in stem thickness will make the capitals 
seem much darker than the lower-case. Therefore, we made the 
capitals similar in weight to the lower-case to keep the alphabets
harmonious at lower resolutions. 

The design of capitals is also affected by the orthographies 
of different languages. De-emphasized capitals are often pre- 
ferred for German language texts that make extensive use of 
capitals. However, we also anticipated that some French and 
English typographers would request more robust capitals, in 
keeping with certain national printing traditions and cultural 
views. We therefore designed an alternate set of capitals that 
are heavier in weight, especially for use on higher-resolution 
devices. 


3. The Structure of an Extended Family 

3.1 Teleology. 

The history of typography shows a tendency for typeface de- 
signs to become united into families. Capitals and minus- 
cules (lower-case) were united in the early 15th century; roman 
and italic in the 16th century; normal and bold weights in the 
19th century. The first typeface family to include both seriffed
and sans-serif alphabets was Romulus, designed by Jan van
Krimpen in the 1930s. 


3.2 Dimensions of Typographic Space.

Lucida continues the historical trend toward extended design 
families by structuring several letterform styles in one family: 
roman vs. italic; normal vs. bold; seriffed vs. sans-serif; pro- 
portional vs. mono-spaced; Latin vs. Greek. The family is thus 
a system of oppositions which can be thought of as defining a 
multi-dimensional space of typographic variation. 

These contrasting variations are precisely aligned in their 
vertical letter proportions and standardized in weights. A 
change along one dimension leaves most other characteristics 
of the typeface unaltered, with the exception of letter widths. 
Widths are similar, though not quite identical between roman 
and italic, and seriffed and sans. The bold weights are propor- 
tionally wider than the normal weights. 
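
One way to picture this space of oppositions is as a record with one field per dimension. The Python sketch below is only an illustration: the field names and the `vary` helper are hypothetical, and enumerating every combination does not imply that each one exists as a released Lucida face.

```python
from dataclasses import dataclass, replace
from itertools import product

# The five oppositions named in the text, one axis each.
AXES = {
    "slant": ("roman", "italic"),
    "weight": ("normal", "bold"),
    "serif": ("seriffed", "sans-serif"),
    "spacing": ("proportional", "mono-spaced"),
    "script": ("Latin", "Greek"),
}


@dataclass(frozen=True)
class Style:
    slant: str = "roman"
    weight: str = "normal"
    serif: str = "seriffed"
    spacing: str = "proportional"
    script: str = "Latin"


def vary(style, axis, value):
    """Move along one dimension and leave the other coordinates unaltered."""
    if value not in AXES[axis]:
        raise ValueError(f"{value!r} is not a value of axis {axis!r}")
    return replace(style, **{axis: value})


def all_points():
    """Every corner of the space, whether or not a face exists there."""
    names = list(AXES)
    return [Style(**dict(zip(names, combo))) for combo in product(*AXES.values())]


bold = vary(Style(), "weight", "bold")
print(bold)
print(len(all_points()), "combinations")
```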


3.3 Semiology of Type Styles. 

Each graphic typeface variation can be used to signify or mark 
some semantic aspect of the text. Roman may be used for nor- 
mal text, italic for differentiation, bold for emphasis, bold italic 
for emphatic differentiation, sans-serif for technical text, script 
for casual notes, and so forth. 

Type styles used as signifiers are part of the “passive vocab- 
ulary” of typographic literacy; readers understand them, but 
type variations are not necessarily part of the “active vocabu- 
lary” of every author. Like other languages, a graphic language 
of formal variations requires practice for the user to become 
fluent. Initially, one follows conventional styles of typography, 
but more imaginative expression becomes possible as one be- 
comes more familiar with the medium. 

The harmonization and simplification of the Lucida family 
is intended to make the Lucida typefaces easier for authors to 
use intuitively. When typographic documents are formatted 
with systems like TROFF, TeX, and Scribe, graphic variations 
should be clear and comprehensible to the author as well as to 
the reader. Clarity and simplicity of variation makes it easier 
for an author to use typefaces expressively and powerfully. 


DOCUMENTATION GRAPHICS 

26 BIGELOW & HOLMES 


3.4 Modularization. 

Another effect of harmonization is to make typefaces easier to 
implement. Within each Lucida face, many elements such as 
stems, serifs, and bowls are repeated. Should it be necessary 
to save space in a font implementation, characters can be represented
as assemblages of component parts rather than separate
characters. Across the family, different designs may also
share certain features. Seriffed and sans-serif faces of the same
weight and stress share the same stems and outer contours of 
bowls. The entire Lucida family could be further compacted by 
exploiting these similarities. 
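
A minimal sketch of such a component-based representation follows. The component outlines, offsets, and names are invented for illustration and are not Lucida data; the point count ignores the small cost of storing the references themselves.

```python
# Hypothetical component outlines in design units; not actual Lucida data.
COMPONENTS = {
    "stem": [(0, 0), (80, 0), (80, 500), (0, 500)],
    "arch": [(80, 380), (300, 500), (380, 380), (380, 0), (300, 0), (300, 360)],
}

# Each character is a list of (component name, x offset, y offset).
GLYPHS = {
    "n": [("stem", 0, 0), ("arch", 0, 0)],
    "m": [("stem", 0, 0), ("arch", 0, 0), ("arch", 300, 0)],
}


def assemble(name):
    """Expand a character into translated copies of its component outlines."""
    return [
        [(x + dx, y + dy) for (x, y) in COMPONENTS[part]]
        for (part, dx, dy) in GLYPHS[name]
    ]


def stored_points_shared():
    """Outline points stored when each component is kept once."""
    return sum(len(outline) for outline in COMPONENTS.values())


def stored_points_separate():
    """Outline points stored when every character keeps its own copies."""
    return sum(len(COMPONENTS[part]) for parts in GLYPHS.values()
               for (part, _, _) in parts)


print(stored_points_shared(), "points shared,",
      stored_points_separate(), "points if stored separately")
```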

Modularization of design also made it easier to produce the 
faces, both in outline and in raster format. More often than 
not, the principal advantage of rationalization and modularization
was simply a precise understanding of the design parameters.
This was often reassuring when we were caught in the
coils of the magnitude of the actual production. The Lucida
family currently includes 1,500 outline masters and 12,000
raster characters, with more in production. In the midst of this
daunting multiplicity, a coherent design structure that could be
expressed in logical and numerical relationships made it easier 
to remember, communicate, and record what a given letter im- 
age or group of images was supposed to look like at any given 
size on a variety of displays. 


3.5 Screen Fonts. 

The principles that shaped the Lucida designs for printers simi- 
larly influenced the design of bitmap versions of Lucida for CRT 
displays. At screen resolutions of 75 and 100 lines per inch, all 
sizes of the Lucida fonts required bitmap editing. 

From 6 to 22 pixels per body, the fonts were mainly con- 
structed by hand, using a bitmap editing system. At these low 
resolutions, there are so few pixels in each letterform, and the 
position of each pixel is so crucial, that only the experienced 
eye of the designer can make an optimal judgement. For sizes 
of 24 pixels per body size and greater, the fonts were produced 
in two stages. Digital spline outlines were first deformed to the 
given raster, using the Ikarus software system. This provided a 
general idea of the characteristics of the font at a given size. The
resulting rasters were then hand-edited to optimize the fonts 
on the screen.

Because of their low resolution, screen fonts cannot be exact
reproductions of their higher resolution counterparts. We
wanted the screen fonts to be usable in “WYSIWYG” systems 
along with Lucida on printers, but also to be useful on their 
own, when optimized for legibility on the screen without the 
procrustean distortion to match spacing values of higher reso- 
lution devices that the simple-minded WYSIWYG systems usu- 
ally demand. To emphasize that the screen fonts can exist as 
independent entities, we christened them with the name Pellu- 
cida, which connotes that the designs are related to Lucida, but 
optimized for “pel” based screen displays. 


4. Conclusion 

Typography holds a particular fascination for the inquiring 
mind, and this is nowhere more evident than in the realm of 
electronic printing and publishing. Typography is abstract, 
achromatic, and two-dimensional, yet it constitutes a complete 
aesthetic microcosm accessible to the literate intellect. Type- 
faces exist only to serve language, yet their art is as subtle as 
music or painting. The forms of the letters are intuitive and 
mystical, yet they are ruled by numerical principles and sys- 
tems of measurement and proportion. The patterns of the al- 
phabet are arbitrary and historical, yet they reveal a complex 
symmetry and an intricate evolution. The texture of a page is 
completely visible, yet how it emerges from the interaction of 
its myriad components remains obscure. 

We designed Lucida to meet the practical needs of contempo- 
rary electronic publishing, but it was for us also an exploration 
of that aesthetic realm at the intersection of science and art. 
Because Lucida was, to our knowledge, the first original typeface
family produced for digital printers and displays, we necessarily
based much of its design on principles more than on
precedents. Some of those principles had to be invented as we
worked on the design and encountered puzzles for which there
were no ready answers. Yet the design is not completely novel,
nor can it be wholly reduced to logic, for many of the principles
were distilled from alphabets created in previous eras by
visionary artists who bequeathed us their letterforms but not
their reasoning.


A Device Independent Graphics Imaging
Model for Use with Raster Devices


John E. Warnock and Douglas K. Wyatt


Abstract: In building graphic systems for use with raster devices, it is difficult to
develop an intuitive, device independent model of the imaging process, and to 
preserve that model over a variety of device implementations. This paper describes an 
imaging model and an associated implementation strategy that: (1) integrates scanned 
images, text, and synthetically generated graphics into a uniform device independent 
metaphor; (2) isolates the device dependent portions of the implementation to a small 
set of primitives, thereby minimizing the implementation cost for additional devices; (3) 
has been implemented for binary, grey-scale, and full color raster display systems, and 
for high resolution black and white printers and color raster printers. 


CR categories: I.3.2, I.3.4, I.3.6, I.4.1.


Key words and phrases: Device independence, Graphics systems, Raster graphics, 
Sampling 


Introduction 


The work described in this paper is designed for a multi-application programming 
environment where programmers use raster display devices to provide the visual 
communication links between users and systems. The displays are used for simple typescript-style
text applications as well as for more involved applications requiring drawings, scanned
images, and other complex combinations of graphics and text: a music composition system, a
general window management package for a programming environment, a high quality
document design system, a VLSI design system, and a graphics arts design package.


In an environment that supports such diverse graphic user interfaces on a variety of 
display devices, it is desirable to maintain a flexible unified graphics imaging model and an 
associated programming interface, independent of display devices, with which all the 
application programs can work. 


This paper describes an imaging model and its programming interfaces, discusses the 
advantages in using such a model, and outlines a basic implementation strategy. 


Raster Devices 


The class of raster devices encompasses a wide range of displays, plotters, and printers. 
These include full color (24 bit per pixel) displays, grey level displays, simple low resolution 
binary (1 bit per pixel) displays, electrostatic plotters, high resolution film recorders, and laser 
printers. Raster devices, because of their potential ability to display a rich set of images, serve 
as a useful class over which to define a device independent programming abstraction. 


Broadening the class of raster devices to include other kinds of graphic devices (vector 
drawing displays, pen plotters, or storage tube displays) can lead to problems in defining the 
imaging abstractions. Either the image types become restricted or the imaging metaphors 
become strained and unnatural. For example, the implementation of solid areas on some 
vector drawing devices is impractical. Restricting the set of devices to raster devices allows 
the imaging metaphor to remain simple, consistent, and efficient. 


Device Independence 


Device independence, in the context of using raster devices, can be defined in several 
ways depending on the level of abstraction desired. 


One definition dictates that the imaging model provides an abstraction of how an image 
ideally looks on a perfect medium; this model abstracts the appearance of the image. The 
implementation for each specific display must mimic the appearance of an ideal image to the 
best of its ability. This kind of device independence attempts to maintain global image 
properties in spite of wide variations in display type. For example, a device that can show 
grey values might render the appearance of color values by substituting appropriate grey 
values. A binary device might render colors with stipple patterns that give a visual impression 
of grey values. 
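
The two substitutions mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the method of any particular device: the luminance weights are the commonly used 0.299, 0.587, 0.114 values, and the stipple uses a 2 by 2 ordered-dither matrix; both choices, and all names, are assumptions of this sketch.

```python
BAYER_2X2 = ((0, 2), (3, 1))   # ordered-dither matrix (an assumed choice)


def to_gray(rgb):
    """Substitute a grey value in 0..1 for a colour given as (r, g, b) in 0..1."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b   # assumed luminance weights


def stipple(gray, width, height):
    """Render a grey value on a binary device as a pattern of 0s and 1s.

    A 1 means the pixel is inked. Darker greys ink a larger share of pixels.
    """
    darkness = 1.0 - gray
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4.0
            row.append(1 if darkness > threshold else 0)
        rows.append(row)
    return rows


for line in stipple(to_gray((0.5, 0.5, 0.5)), 8, 4):
    print("".join("#" if bit else "." for bit in line))
```

In both cases the application asks for a colour and the device implementation decides how to approximate its appearance.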


Other kinds of device independent abstractions can be less strict. Image representations
may not model image appearance but instead describe some form of information content. In
this case, the implementation of a given device is only required to convey certain information
content of the application, and may not be constrained in any way to the appearance of the
image. For example, a black and white device implementation might choose to display colors 
with iconic labels rather than intensity levels. With this kind of abstraction, few constraints 
are put onto particular device implementations. One consequence of this latter kind of 
abstraction is that the programmer cannot have precise expectations of how a device 
implementation will attempt to represent images. 


Device independence benefits the implementors of a graphics system as well as its clients, 
since the bulk of the system can be shared across all devices. Only a small portion of the code 
need be concerned with a specific device type. If the interface to this device-specific code is 
well designed, implementing a new device type requires minimal programming effort. 


The Imaging Model 


The imaging model described here is designed for applications related to the typesetting 
and graphics arts industry, where image appearance is vital. For this reason the model 
abstracts the geometric and color properties of an image. In taking this approach, the imaging
model makes two value judgements: first, that global image fidelity is important; second, that 
it is valuable for the application to be able to rely on a device implementation to render 
images as accurately as possible. It should be noted at the outset that for a number of 
applications this choice may be inappropriate. 


The imaging model specifies how geometric shapes and colors are combined. It follows a 
metaphor that loosely corresponds to the procedure used by a silk-screen printer: pushing 
colored ink through a stencil onto paper. The left side of Figure 1 illustrates this operation. 
The ink, above, is solid gray; the screen, in the middle, has an a-shaped opening; on the 
paper, below, the result is an a-shaped patch of gray ink. The artist can build a complex 
image by repeating this basic operation with different combinations of screens and inks. Ink 
laid down later may obscure ink laid down earlier. 


The programming interface presents a similar model. The programmer calls a series of 
procedures to define a stencil, and other procedures to define a source. Each primitive display 
procedure produces on the display the effect of pushing a given source through a given 
stencil. The programmer can build a complex image by calling a sequence of display 
primitives with different combinations of stencils and sources. 


Stencils may be represented in two forms: shapes and masks. A shape consists of a
collection of closed piecewise analytic curves (straight lines and parametric cubics); these
curves represent the outlines of holes in the stencil. A mask consists of a binary two
dimensional array; “ones” in the array represent holes in the stencil. There is no special
representation for text. Characters are just letter-shaped stencils, which can be represented
either as closed analytic outlines or as masks.


Figure 1: The imaging model.


Sources (inks) may be represented either as single colors or as multi-colored two 
dimensional sampled images. When the source is multi-colored, the imaging model is much 
more powerful than any analogous silk-screening operation. Picture a slide projector shining 
a general colored image through a stencil onto the paper. The right side of Figure 1 shows the 
result of pushing a multi-colored source through a stencil. 


Other properties of the imaging model lack good analogies in the silk-screening 
metaphor. The first of these is an additional level of stencil called a clipping region. The 
purpose of the clipping region is to restrict the area where ink is displayed regardless of what 
other shapes or masks are used. When a clipping region is specified, then only ink falling 
inside that region is displayed. Figure 2 illustrates the effect of a clipping region. 
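
The basic operation, including the clipping region, can be sketched for the simplest case in which the stencil, the source, the clipping region, and the page are all arrays of the same size. The Python below is an illustration only; the function name `paint` and the use of single characters as colors are assumptions of the sketch, not part of the model's interface.

```python
def paint(page, stencil, source, clip=None):
    """Push `source` through the holes of `stencil` onto `page`.

    stencil and clip are masks (1 marks a hole). source is either a single
    color or a sampled image given as a list of rows. All arrays are assumed
    to have the same dimensions as the page.
    """
    sampled = isinstance(source, list)
    for y, row in enumerate(stencil):
        for x, hole in enumerate(row):
            if not hole:
                continue                      # the stencil blocks the ink
            if clip is not None and not clip[y][x]:
                continue                      # outside the clipping region
            page[y][x] = source[y][x] if sampled else source
    return page


page = [["." for _ in range(4)] for _ in range(3)]
bar = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
column = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
paint(page, bar, "g")          # single-color source
paint(page, column, "k")       # ink laid down later obscures earlier ink

image = [list("abcd"), list("efgh"), list("ijkl")]   # multi-colored source
bottom = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1]]
left_half = [[1, 1, 0, 0] for _ in range(3)]
paint(page, bottom, image, clip=left_half)

for row in page:
    print("".join(row))
```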


Also unlike anything in the silk-screening process are the model’s general mapping
facilities. Under control of the application, stencils and sources may be mapped
independently through any linear transformation prior to display. Imagine rotating the
stencil, or stretching a rubber stencil to expand or skew it; at the same time, imagine rotating
the slide projector, or pulling it back to enlarge the image. The mapping facility gives the
application program a great deal of flexibility in the composition of images.


Figure 2: A clipping region.
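
The mapping facility can be illustrated with 2 by 2 matrices applied to points written as row vectors, so that a point maps to (x, y)M. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and translation is left out for brevity.

```python
import math


def rotation(degrees):
    """Counter-clockwise rotation for row vectors: (x, y) -> (x, y) M."""
    a = math.radians(degrees)
    return ((math.cos(a), math.sin(a)), (-math.sin(a), math.cos(a)))


def scaling(sx, sy):
    return ((sx, 0.0), (0.0, sy))


def skew_x(k):
    """Shear that moves x by k times y and leaves y unchanged."""
    return ((1.0, 0.0), (k, 1.0))


def compose(m, n):
    """Matrix for applying m first and n second."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def apply(m, point):
    x, y = point
    return (x * m[0][0] + y * m[1][0], x * m[0][1] + y * m[1][1])


# The stencil and the source may each be given their own mapping.
stencil_map = compose(scaling(2.0, 2.0), rotation(90))
source_map = skew_x(0.5)
stencil_outline = [(0, 0), (1, 0), (1, 1), (0, 1)]
print([apply(stencil_map, p) for p in stencil_outline])
print(apply(source_map, (0, 2)))
```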





Additional operators generate shapes that correspond to drawn lines and curves. These
operators take trajectories (piecewise analytic segments, open or closed) and brush
information (just another shape) and generate closed shapes that correspond to lines and
curves drawn with the given brush. The resulting lines and curves then act like any other
stencils, and may have inks pushed through them.
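
For straight segments and a convex polygonal brush, the closed shape swept out by the brush is the convex hull of the brush placed at both ends of the segment. The Python sketch below uses that fact; it is a simplified illustration, not the operators' actual algorithm. It handles only straight segments and convex brushes, and the function names are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def stroke_segment(start, end, brush):
    """Closed shape covered by a convex brush dragged from start to end."""
    placed = [(bx + px, by + py) for (px, py) in (start, end) for (bx, by) in brush]
    return convex_hull(placed)


def stroke(trajectory, brush):
    """One closed shape per segment; their union is the drawn line."""
    return [stroke_segment(a, b, brush) for a, b in zip(trajectory, trajectory[1:])]


square_brush = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
print(stroke([(0, 0), (10, 0), (10, 5)], square_brush))
```

Each resulting polygon is an ordinary shape and can be used as a stencil like any other.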


This model does not address issues concerning the display of projected three dimensional
objects, or issues dealing with complex conformal mappings. It is assumed that these kinds of
objects, if desired, are transformed into appropriate two dimensional imaging constructs prior
to display.


Interfaces and Implementation 


The following description centers on the interfaces which define the boundaries between 
program components. A carefully designed interface effectively decouples its clients from the 
internal details of its implementors by defining a set of available operations. Each client
retains a pointer to the state information needed by the implementor, but uses only the
interface-defined operations to manipulate that state. The implementation outlined here 
relies heavily on this notion of an interface to achieve device independence. 


The programmer who wants to display or print pictures is a client of the application 
interface. The operations defined in this interface allow the application programmer to 
construct images by combining various sources and stencils. This is the interface that presents 
the imaging model described above. Equally important, however, are the internal interfaces 
which separate the device-independent components of the implementation from the device- 
dependent components. These interfaces isolate most of the implementation from the 
peculiar characteristics of different display devices and image sources. 


Coordinate Systems 


One of the key ideas in making applications independent of devices is defining 
coordinate systems and isolating them from each other. Because pixel addressing conventions 
vary across different raster devices, it is particularly important to isolate the device coordinate 
system (DCS) from the application program view of the system. To achieve this isolation, the 
imaging model defines an intermediate coordinate system called the virtual coordinate system 
(VCS). This common coordinate system serves as the meeting ground for device
implementations and user applications. The system implementor writing the device
dependent code for a particular display is concerned only with the mapping between the VCS
and the DCS. The application programmer using the imaging operators is concerned only
with building images relative to the virtual coordinate system. 
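
A device's VCS-to-DCS mapping might look like the following Python sketch. The paper does not fix the units or orientation of either coordinate system, so everything specific here is an assumption made for illustration: VCS units of 1/72 inch with y increasing upward, a device whose pixel rows are counted downward from the top, and the function name itself.

```python
def make_vcs_to_dcs(dpi, device_height_px):
    """Return the mapping a hypothetical device would supply.

    Assumes VCS units of 1/72 inch with y upward, and a device coordinate
    system in pixels with y counted downward from the top edge.
    """
    def vcs_to_dcs(x, y):
        dx = x * dpi / 72.0
        dy = device_height_px - y * dpi / 72.0
        return dx, dy
    return vcs_to_dcs


# The application uses the same virtual coordinates for both devices.
printer = make_vcs_to_dcs(dpi=300, device_height_px=3300)
screen = make_vcs_to_dcs(dpi=75, device_height_px=825)
for name, mapping in (("printer", printer), ("screen", screen)):
    print(name, mapping(72, 72))
```

Only the function supplied by each device knows about resolution and pixel addressing; the application's coordinates do not change.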


Another aspect of coordinate systems often overlooked in building raster display systems
is the isolation of the source coordinate system (SCS) in which sampled image sources are 
defined. Sampled images are, in some sense, dual to display devices. They provide a raster 
input form for images in the same way that devices provide a raster output form for images. 
The application program should be no more concerned with scanning properties or 
coordinate system particulars of source images than with scanning properties or coordinate 
systems of display devices; it should deal with images in the virtual coordinate system.


To accomplish this, the imaging model adopts conventions for source independence 
analogous to its conventions for device independence. Knowing these conventions, the 
application can predict how images or masks will be mapped into the virtual coordinate 
system. Therefore, the application can manipulate and transform the image geometrically 
without regard to the scanning or resolution properties of the image or mask. 


Treating source images in this way gives a pleasing symmetry to the implementation as 
well. The interface to a raster input source provides the source’s boundary in the source
coordinate system, and provides a mapping from the source coordinate system to the virtual
coordinate system. The interface to a raster output device provides a mapping from the 
virtual coordinate system to the device coordinate system, and provides the device's boundary 
in the device coordinate system. Given these interfaces, the implementation has all the 
information it needs to map coordinates directly from source to destination, and to compute
the intersection of their boundaries. 
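
The composition of the two mappings and the intersection of the two boundaries can be sketched as follows. The Python below is illustrative only: it uses 3 by 3 matrices with points as row vectors, treats boundaries as axis-aligned rectangles (so a rotated source would be handled conservatively by its bounding box), and all names and numeric values are hypothetical.

```python
def scale_translate(sx, sy, tx=0.0, ty=0.0):
    """Matrix for row vectors: (x, y, 1) M scales, then translates."""
    return ((sx, 0.0, 0.0), (0.0, sy, 0.0), (tx, ty, 1.0))


def compose(a, b):
    """Matrix for applying a first and b second."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def apply(m, x, y):
    return (x * m[0][0] + y * m[1][0] + m[2][0],
            x * m[0][1] + y * m[1][1] + m[2][1])


def map_rect(m, rect):
    """Bounding box of a rectangle (x0, y0, x1, y1) after mapping."""
    x0, y0, x1, y1 = rect
    corners = [apply(m, x, y) for x in (x0, x1) for y in (y0, y1)]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return (min(xs), min(ys), max(xs), max(ys))


def intersect(a, b):
    """Overlap of two rectangles, or None if they do not overlap."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)


# Provided by the source interface (hypothetical values).
source_boundary = (0, 0, 100, 50)
source_to_vcs = scale_translate(0.5, 0.5, 36, 36)
# Provided by the device interface (hypothetical values).
vcs_to_dcs = scale_translate(4, 4)
device_boundary = (0, 0, 300, 300)

source_to_dcs = compose(source_to_vcs, vcs_to_dcs)
visible = intersect(map_rect(source_to_dcs, source_boundary), device_boundary)
print(source_to_dcs)
print(visible)
```

Only the part of the source that falls inside `visible` needs to be sampled and written to the device.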


The following description of an implementation of the imaging model will illustrate how 
all the above concepts hang together. 


Application Interface 


Each application using the imaging model may invoke a collection of imaging operators 
through the application interface. These operators define mappings, clipping regions, shapes,
colors, masks and image sources, and cause shapes to be displayed. Because of the many
differences in languages and operating systems, only an indication of the framework of 
operators is given here. Additions are needed to fill in the details for a specific programming 
environment. 


The state information associated with an application is called the display context. As an 
application uses the display, the imaging operators use and modify the information in the
display context. This information includes: 

1. An interface to a device.

2. The current position (cpx, cpy) in the device coordinate system.

3. The transformation matrix T that maps application defined shapes into the device
coordinate system.

4. The clipping region (CR).
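
The four items above could be held in a record such as the one sketched below. This Python is an illustration, not the paper's data structure: the `Device` class is a hypothetical stand-in for a device interface, the clipping region is simplified to a rectangle, and the initial values follow the description of NewDisplayContext given in this paper.

```python
from dataclasses import dataclass
from typing import Tuple

Matrix = Tuple[Tuple[float, float, float], ...]
Rect = Tuple[float, float, float, float]


@dataclass
class Device:
    """Hypothetical stand-in for the interface a device implementation provides."""
    vcs_to_dcs: Matrix     # mapping from virtual to device coordinates
    boundary: Rect         # device boundary in device coordinates


@dataclass
class DisplayContext:
    device: Device                 # 1. an interface to a device
    cp: Tuple[float, float]        # 2. current position, device coordinates
    T: Matrix                      # 3. transformation into device coordinates
    clip: Rect                     # 4. clipping region (a rectangle here)


def new_display_context(device):
    """Start with the device's own mapping, its full boundary, and cp at the origin."""
    return DisplayContext(
        device=device,
        cp=(0.0, 0.0),
        T=device.vcs_to_dcs,
        clip=device.boundary,
    )


device = Device(vcs_to_dcs=((4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 1.0)),
                boundary=(0.0, 0.0, 300.0, 300.0))
dc = new_display_context(device)
print(dc.cp, dc.clip)
```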


The notation used to indicate the procedure interface is of the form: 


ProcedureName: PROCEDURE [p1: PType1, p2: PType2, ...]
    RETURNS [r1: RType1, r2: RType2, ...];


Here it is assumed that each procedure takes input parameters (named p1, p2, ...) of
various types, and returns results (named r1, r2, ...) of various types. Some of the type names
should be obvious (e.g., Real for floating point numbers); others (e.g., Trajectory) are left
undefined. For these undefined types, it is assumed that the implementation will define an
appropriate data structure. The particulars of the data structures chosen are not important to
this discussion, and need not be known by the application using the interface.


The Device and Image types hold state information for particular display devices and 
image sources. The interfaces to devices and images will be described below. Note, however, 
that the application can treat them entirely as “black boxes”; it need not know even their 
interface definitions. 


NewXXXDevice: PROCEDURE [<optional parameters>] 
RETURNS [Device]; 


A procedure of this form is provided by each device implementation. 


NewXXXImage: PROCEDURE [<optional parameters>] 
RETURNS [Image]; 


A procedure of this form is provided by each scanned image type. 


NewDisplayContext: PROCEDURE [device: Device] 
RETURNS [dc: DisplayContext]; 


Initializes a display context. The transformation T is initialized to the VCS-to-DCS mapping provided by the 
device. The clipping region CR is initialized to be the boundary of the device. The current position (cpx,cpy) is 
set to (0,0). 


GetCurrentPosition: PROCEDURE [dc: DisplayContext] 
RETURNS [x,y: Real]; 


Returns x,y such that (x,y)T = (cpx,cpy). 


SetCurrentPosition: PROCEDURE [dc: DisplayContext, 
x,y: Real]; 


Sets (cpx,cpy) to (x,y)T. 


NewTrajectory: PROCEDURE [x,y: Real] 
RETURNS [t: Trajectory]; 


Returns a new trajectory. Every trajectory has a first position (FP) and a last position (LP); for a new trajectory, 
both FP and LP are set to (x,y). 


LineTo: PROCEDURE [t: Trajectory, x,y: Real] 
RETURNS [u: Trajectory]; 


Returns a trajectory u that includes, in addition to t, the line segment from t’s LP to (x,y). The LP of u is (x,y). 


CurveTo: PROCEDURE [t: Trajectory, x1,y1,x2,y2,x3,y3: Real] 
RETURNS [u: Trajectory]; 


Returns a trajectory u that includes, in addition to t, a cubic curve segment from t’s LP to (x3,y3). The curve 
segment is defined by its four Bezier control points: t’s LP, (x1,y1), (x2,y2), and (x3,y3). The LP of u is 
(x3,y3). 


Close: PROCEDURE [t: Trajectory] 
RETURNS [u: Trajectory]; 


Returns a trajectory u that includes, in addition to t, the line segment from t’s LP to t’s FP. The FP and LP of u 
are equal. 


Rectangle: PROCEDURE [xl,yl,xu,yu: Real] 
RETURNS [t: Trajectory]; 


A convenience function, equivalent to: 
t ← NewTrajectory[xl,yl]; 
t ← LineTo[t,xu,yl]; 
t ← LineTo[t,xu,yu]; 
t ← LineTo[t,xl,yu]; 
t ← Close[t]; 
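
The trajectory operators above are purely functional: each returns a new trajectory and leaves its argument alone. A minimal sketch of that behavior follows, in Python rather than the notation of the interface. The representation (a tuple holding FP, LP and a list of segments) and all the names in it are assumptions made for illustration; the interface deliberately leaves the Trajectory data structure undefined.

```python
from typing import NamedTuple, Tuple

Point = Tuple[float, float]


class Trajectory(NamedTuple):
    """Hypothetical representation: first position, last position, segments."""
    fp: Point
    lp: Point
    segments: tuple  # each entry is ("line", end) or ("curve", c1, c2, end)


def new_trajectory(x, y):
    # For a new trajectory both FP and LP are (x, y).
    return Trajectory((x, y), (x, y), ())


def line_to(t, x, y):
    # Returns a new trajectory; t itself is left unchanged.
    return Trajectory(t.fp, (x, y), t.segments + (("line", (x, y)),))


def curve_to(t, x1, y1, x2, y2, x3, y3):
    # The four Bezier control points are t.lp, (x1,y1), (x2,y2), (x3,y3).
    segment = ("curve", (x1, y1), (x2, y2), (x3, y3))
    return Trajectory(t.fp, (x3, y3), t.segments + (segment,))


def close(t):
    # Adds the line segment from LP back to FP, so FP and LP become equal.
    return line_to(t, t.fp[0], t.fp[1])


def rectangle(xl, yl, xu, yu):
    t = new_trajectory(xl, yl)
    t = line_to(t, xu, yl)
    t = line_to(t, xu, yu)
    t = line_to(t, xl, yu)
    return close(t)
```

With this representation, rectangle(0, 0, 2, 1) yields a trajectory of four line segments whose FP and LP are both (0, 0).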


NewShape: PROCEDURE RETURNS [s: Shape]; 


Returns an empty shape list. 


AddToShape: PROCEDURE [s: Shape, t: Trajectory] 
RETURNS [r: Shape]; 


Returns a Shape r that contains, in addition to the trajectories of s, the trajectory t. 


MakeLineShape: PROCEDURE [brush: Shape, t: Trajectory] 
RETURNS [s: Shape]; 


The locus of each point interior to the brush is computed as the origin of the brush shape is moved along the 
trajectory. The union of all these loci forms a set of solid areas. The boundaries of these areas make up a shape. 
It is this shape that is returned by MakeLineShape. Note: the above definition is just that, and does not 
describe how line shapes might really be computed. 


MakeColorSource: PROCEDURE [hue,sat,brightness: Real] 
RETURNS [s: Source]; 


Supplies a Source data structure representing solid ink of the specified color. 


MakeImageSource: PROCEDURE [image: Image] 
RETURNS [s: Source]; 


Supplies a Source data structure representing the specified sampled image. 


DrawShape: PROCEDURE [dc: DisplayContext, 
shape: Shape, source: Source]; 


Maps the shape and source through the transformation T, clips the shape against the CR, and displays the 
shape with the given source as the ink. 


DrawMask: PROCEDURE [dc: DisplayContext, 
mask: Image, source: Source]; 


Maps the mask, its boundary and the source through the transformation T, clips the boundary against the CR, 
and displays the resulting clipped mask with the given source as the ink. 


SetClipShape: PROCEDURE [dc: DisplayContext, 
shape: Shape]; 


Maps the shape through the transformation T, clips the shape against the CR, and installs the shape as the new 
clipping region. 




Translate: PROCEDURE [dc: DisplayContext, x,y: Real]; 


Builds a transformation matrix M that will translate (0,0) onto (x,y), and sets T, in the display context, to MT. 


Rotate: PROCEDURE [dc: DisplayContext, angle: Real]; 


Builds a rotation matrix M that will rotate (1,0) onto (cos(angle),sin(angle)), and sets T, in the display context, 
to MT. 


Scale: PROCEDURE [dc: DisplayContext, sx,sy: Real]; 


Builds a transformation matrix M that will scale (1,1) onto (sx,sy), and sets T, in the display context, to MT. 


Concatenate: PROCEDURE [dc: DisplayContext, m: Matrix]; 


Sets T, in the display context, to mT. 


GetMatrix: PROCEDURE [dc: DisplayContext] 
RETURNS [t: Matrix]; 


Returns the matrix t such that if M is the matrix that transforms VCS to DCS, then t = TM⁻¹. 
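
The order of concatenation matters: points are row vectors, written (x,y)T, and each operator replaces T by MT, so the most recently requested transformation is applied to a point first. The following Python sketch illustrates this under some assumptions of its own: transformations are stored as 3 by 3 affine matrices acting on the row vector (x, y, 1), and the class and function names are hypothetical, not part of the interface.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(p, m):
    """Map p = (x, y) through m using the row-vector form (x, y, 1) m."""
    x, y = p
    return (x * m[0][0] + y * m[1][0] + m[2][0],
            x * m[0][1] + y * m[1][1] + m[2][1])


def translation(x, y):
    return [[1, 0, 0], [0, 1, 0], [x, y, 1]]    # maps (0,0) onto (x,y)


def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]   # maps (1,0) onto (cos, sin)


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]  # maps (1,1) onto (sx,sy)


class DisplayContext:
    """Holds only the transformation T; the other state is omitted here."""

    def __init__(self, vcs_to_dcs):
        self.T = vcs_to_dcs

    def concatenate(self, m):
        self.T = matmul(m, self.T)              # T <- mT

    def translate(self, x, y):
        self.concatenate(translation(x, y))

    def rotate(self, angle):
        self.concatenate(rotation(angle))

    def scale(self, sx, sy):
        self.concatenate(scaling(sx, sy))
```

For example, on a hypothetical device whose VCS-to-DCS mapping doubles both coordinates, calling translate(10, 0) and then mapping the application point (1, 0) gives the device point (22, 0): the point is translated to (11, 0) first and then scaled by the device mapping.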


Because characters of fonts are treated like any other shapes in this imaging model, 
routines to display text do not properly belong in the above set of primitive operators. 
However, since applications often use text and characters extensively, convenience routines 
can be provided to make the display of text simple. A typical set would include: 

MakeFont: PROCEDURE [<font name>] RETURNS [f: FontId]; 
DisplayChar: PROCEDURE [c: Character, f: FontId]; 
DisplayText: PROCEDURE [s: String, f: FontId]; 
GetCharMetrics: PROCEDURE [c: Character, f: FontId] 
RETURNS [<whatever metrics a font provides>]; 


Device Independent Procedures 


The routines that implement the imaging model depend on a large number of ideas and 
algorithms. It is not practical to describe these in detail. Instead the few critical observations 
and algorithms necessary to get the central ideas of the implementation are discussed. 


To best understand how these ideas work together, one needs to understand several 
fundamental operations found in the DrawShape, DrawMask and SetClipShape 
procedures. 


Shape Mapping 

Given a shape (set of closed trajectories) and a transformation matrix M, we will say that 
the shape is mapped using M when all points and cubics that comprise each of the shape’s 
trajectories are transformed using M. 


Figure 3: Shape mapping. 


Shape Approximation 


Given a shape (set of closed trajectories) in the DCS we will say that the shape is 
approximated when all cubics, in each of the shape’s trajectories, are replaced by piecewise 
linear approximations. Since the shape is in the DCS, it is possible to make piecewise linear 
approximations as a function of device resolution. 
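
One common way to make such an approximation is recursive subdivision of each cubic, stopping when the piece is flat to within a tolerance expressed in device units. The Python sketch below is an illustration of that technique, not necessarily the method used in the implementation described here; the function names, the default tolerance of half a pixel, and the depth limit are all assumptions.

```python
import math


def _dist_to_segment(p, a, b):
    """Distance from point p to the line segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    if len2 == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2
    t = max(0.0, min(1.0, t))          # clamp the projection to the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def _mid(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def flatten_cubic(p0, p1, p2, p3, tolerance=0.5, depth=16):
    """Return the end points of line segments approximating the cubic.

    The start point p0 is not included. A piece is accepted when both
    inner control points lie within `tolerance` of the chord p0-p3; since
    the curve lies in the convex hull of its control points, the curve is
    then within `tolerance` of the chord as well.
    """
    flat = max(_dist_to_segment(p1, p0, p3),
               _dist_to_segment(p2, p0, p3)) <= tolerance
    if flat or depth == 0:
        return [p3]
    # de Casteljau subdivision at the parameter midpoint.
    p01, p12, p23 = _mid(p0, p1), _mid(p1, p2), _mid(p2, p3)
    p012, p123 = _mid(p01, p12), _mid(p12, p23)
    m = _mid(p012, p123)
    return (flatten_cubic(p0, p01, p012, m, tolerance, depth - 1) +
            flatten_cubic(m, p123, p23, p3, tolerance, depth - 1))
```

Because the control points are already in the DCS, the tolerance is a distance in pixels: a coarse device tolerates a larger value and receives fewer line segments for the same curve.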


An approximated shape exists as a collection of polygons, which may be mutually 
intersecting, self intersecting and concave. 


Shape Reduction 


An approximated shape is reduced when the polygons that make up the shape are 
converted into a collection of disjoint, convex polygons which tile the interior of the shape. 
The locus of the shape’s interior is determined by applying a wrap number convention. There 
are a number of known tiling algorithms [1,2,3]. 
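
To make the wrap number idea concrete, the Python sketch below computes the wrap (winding) number of a point with respect to the polygons of an approximated shape. It assumes the nonzero convention, under which a point is interior when its total wrap number is not zero; the text does not say which convention is used, so this choice and the function names are assumptions. The sketch only classifies a point and is not a tiling algorithm.

```python
def wrap_number(point, polygon):
    """Number of times the closed polygon winds around the point.

    Counterclockwise turns count +1 and clockwise turns count -1.
    """
    px, py = point
    count = 0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # Positive when the point lies to the left of the edge direction.
        side = (x1 - x0) * (py - y0) - (px - x0) * (y1 - y0)
        if y0 <= py:
            if y1 > py and side > 0:       # upward crossing, point on left
                count += 1
        elif y1 <= py and side < 0:        # downward crossing, point on right
            count -= 1
    return count


def is_interior(point, polygons):
    """Nonzero convention applied to all polygons of a shape (an assumption)."""
    return sum(wrap_number(point, poly) for poly in polygons) != 0
```

With an outer square drawn counterclockwise and an inner square drawn clockwise, points between the two squares have wrap number 1 and are interior, while points inside the inner square have wrap number 0 and fall in a hole.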


Figure 4: Shape reduction. 


Shape Clipping 


An approximated reduced shape will be called clipped if each of its convex polygons has 
been clipped against the set of the convex polygons that make up a clipping region. 


Figure 5: Shape clipping. 
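
Because both the reduced shape and the clipping region are sets of convex polygons, clipping can be done one pair of convex polygons at a time. The Python sketch below uses the Sutherland-Hodgman method for a single pair; this is a standard technique chosen for illustration, not necessarily the one used in the implementation, and it assumes that the clipping polygon lists its vertices counterclockwise. The function names are hypothetical.

```python
def clip_convex(subject, clip):
    """Clip polygon `subject` to convex polygon `clip` (counterclockwise)."""

    def inside(p, a, b):
        # True when p is on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Point where segment p-q meets the line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break                      # nothing left to clip
        a, b = clip[i], clip[(i + 1) % len(clip)]
        current, output = output, []
        prev = current[-1]
        for cur in current:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
            prev = cur
    return output


def clip_shape(shape_polygons, region_polygons):
    """Clip every convex polygon of a shape against every region polygon."""
    pieces = []
    for poly in shape_polygons:
        for region in region_polygons:
            piece = clip_convex(poly, region)
            if piece:
                pieces.append(piece)
    return pieces
```

Since the region polygons are disjoint, the pieces produced by clip_shape are again disjoint convex polygons, which is the form the scan conversion procedure expects.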


The above concepts are used freely in the process of displaying a shape. Of equal 
importance is the description of the basic operation that is carried out by the device 
dependent part of the system. 


Internal Interfaces 


Just as the application program is a client of the application interface, the 
device-independent portion of the system is a client of the device and source interfaces. 
Information available at the device interface includes: 


1. The transformation matrix that maps the virtual coordinate system to the device 
coordinate system. 


2. The shape (in the device coordinate system) that bounds the display area. 


3. A vector of procedures that implement the scan conversion primitives. These are 
device-specific procedures; their calling sequences do not vary from device type to 
device type, but the way they perform their function may vary dramatically across 
device types. 


The source interface is used both for scanned image sources and for masks. Information 
available at the source interface includes: 


1. The transformation matrix that maps the image or mask coordinate system to the 
virtual coordinate system. 


2. The shape (in the source coordinate system) that bounds the image or mask area. 


3. A vector of procedures that implement the source accessing primitives. These are 
source-specific procedures. 


The operations defined by these interfaces are called as required by the device 
independent portions of the system. Some examples of their use are illustrated below. 


Device Dependent Procedures 


Scan Conversion 


Strictly speaking, a given device implementation need supply only one procedure, 
DisplayConvexPolygon. This procedure implements the most general case of scan 
conversion that the device must handle: pushing a general mapped scanned image, as a 
source, through a mapped mask. The arguments taken by the procedure are: 


1. A convex polygon. This polygon represents either part of a shape (if no mask is 
given), or the boundary of a mask (if a mask is given). 


2. A source, which is either a constant value, or: 
a. A mapping S from the source’s SCS to the DCS, and 
b. A pointer to the source sample array. 


3. An optional mask which includes: 
a. A mapping M from the mask’s SCS to the DCS, and 
b. A pointer to the mask sample array. 


The operation carried out by this procedure is: 


For each pixel position (x,y) in the interior of the convex polygon in DCS, compute (xₛ,yₛ) = 
(x,y)S⁻¹ and (xₘ,yₘ) = (x,y)M⁻¹. If the value in the mask array at (xₘ,yₘ) is 1, then interpret 
the source value at (xₛ,yₛ) for the device type and display an appropriate value at (x,y). Note: 
in practice, instead of mapping each point in DCS through two inverse mappings 
(computationally expensive), incremental mapping techniques are used; in this way each 
mapping is replaced with two add operations. This is discussed later in the section on 
optimizations. 


In most implementations of a given device, special cases that are expected to occur frequently 
can be provided as subcases of the above. Two common special cases are listed below. 


1. DisplayRectangularMask. This procedure is a special case of the above where 
the polygon is rectangular in the DCS, the source is a constant color, and the 
mapping from the SCS to DCS contains only a translation component. This 
procedure is typically used for the display of most characters. 


2. DisplaySimplePolygon. This procedure is a special case of 
DisplayConvexPolygon where the source is constant, and there is no mask. Most 
line drawings and simple filled shapes use this procedure. 


It can be seen how the above device independent and device dependent notions are brought 
together by examining the processing steps associated with the DrawShape, DrawMask and 
SetClipShape procedures. 


DrawShape (source is a constant color) 


1. Map the input shape into the DCS using T (the transformation from the display 
context). 


2. Approximate the resulting shape, reduce it, and clip it to CR (the clipping region 
from the display context). 





3. Pass each resulting convex polygon to the DisplayConvexPolygon procedure 
(found in the device interface in the display context), along with the constant color 
value. 


DrawShape (source is a scanned image) 


1. Concatenate the SCS-to-VCS transformation Q (from the scanned image interface) 
with the VCS-to-DCS transformation T (from the display context) to form the SCS-to-DCS 
transformation S (S = QT). 


2. Map the shape that bounds the scanned image (from the scanned image interface) 
into the DCS, using S. 


3. Approximate the resulting shape, reduce it, and clip it to CR. 


4. Use the resulting set of convex polygons to define a new clipping region, CR*. 


5. Map the input shape (the argument to DrawShape) into the DCS, using T. 


6. Approximate the resulting shape, reduce it, and clip it to CR*. At this point, the 
resulting convex polygons represent the intersection of the mapped boundary of the 
image and the mapped input shape. 


7. Pass each convex polygon to the DisplayConvexPolygon routine, along with 
transformation S and a pointer to the image samples. 


DrawMask (source is a constant color) 


1. Concatenate the SCS-to-VCS transformation R (from the mask interface) with the 
VCS-to-DCS transformation T to form the SCS-to-DCS transformation M (M = RT). 


2. Map the bounding shape of the mask into the DCS, using M. 


3. Approximate the resulting shape, reduce it, and clip it to CR. 


4. Pass each resulting convex polygon to the device’s DisplayConvexPolygon 
procedure, along with the constant color value, transformation M, and a pointer to 
the mask samples. 


DrawMask (source is a scanned image) 


1. Concatenate the SCS-to-VCS transformation Q (from the scanned image interface) 
with the VCS-to-DCS transformation T to form the SCS-to-DCS transformation S (S 
= QT). This mapping transforms coordinates from the scanned image to the device. 


2. Map the bounding shape of the image into the DCS, using S. 


3. Approximate the resulting shape, reduce it, and clip it to CR. 


4. Use the resulting set of convex polygons to define a new clipping region, CR*. 


5. Concatenate the SCS-to-VCS transformation R (from the mask interface) with the 
VCS-to-DCS transformation T to form the SCS-to-DCS transformation M (M = RT). 
This mapping transforms coordinates from the mask to the device. 


6. Map the bounding shape of the mask into the DCS, using M. 


7. Approximate the resulting shape, reduce it, and clip it to CR*. 


8. Pass each resulting convex polygon to the device's DisplayConvexPolygon 
procedure, along with transformations S and M and pointers to the image and mask 
samples. 





SetClipShape 


1. Map the input shape into the DCS, using T. 


2. Approximate the resulting shape, reduce it, and clip it to CR. 


3. Install the resulting set of convex, non-intersecting polygons as the new CR in the 
display context. 


Note that the DisplayConvexPolygon routine never needs to do any boundary 
checking. When a scanned image or mask is present, the shape to be scan converted always 
represents an interior portion of the image or mask. Also, the clipping region in the display 
context always lies inside the device boundary. 





Optimizations 











Although the above algorithms may involve extensive computation, certain simple 
expected cases can be made to short circuit most of the above machinery without losing any of 
the generality. Three of these short cuts will now be described. 


The display of characters 


In situations where high performance is required, character fonts are designed to work 
with a given display type, i.e., a character is defined to be a mask whose resolution and 
scanning characteristics are the same as the device’s. Also the bounding shape of a character 
mask is a rectangle. Under these circumstances, the mapping taking the mask in SCS to VCS, 
when concatenated with the device’s mapping from VCS to DCS, will be the identity mapping 
(with, perhaps, an application-introduced translation component). Since this is the case, the 
bounding rectangle of the mask will map into a rectangle in the DCS. If in addition the source 
color is solid black and the bounding rectangle is not clipped, the scan conversion process 
reduces to the one-to-one transfer of pixels in one rectangle to pixels in another. To avoid 
checking for the identity transformation for each character, the combined transformation 
from SCS to VCS to DCS can be noted as an identity in the display context. If no operations 
that change transformations (other than translation) take place between the display of 
successive characters, then no transformation checking need be done. 


If any of the above conditions is not true, then the optimization fails and the more 
general machinery will display the character. 


Transforming sources and masks 


The rotation, scaling and sampling of sources and masks can be computationally 
expensive. To reduce this computation the following steps are taken (the discussion centers on 
the handling of sources, but the same discussion applies to masks). 


Scan conversion, in the device coordinate system (DCS), is carried out pixel-by-pixel 
along successive scan lines. If the unit vector (1,0) representing the delta vector between 
successive pixels in a scan line is mapped through S⁻¹ (the mapping from the device 
coordinate system to the source coordinate system), then the delta vector ds between required 
sample positions in the source is obtained. Successive sample positions in the source that 
correspond to the successive pixels along the scan line can be incrementally computed by 
mapping the beginning of the scan line in the DCS through S⁻¹, and incrementally adding ds 
to obtain the position of the next sample to be used in the scan line. This optimization 
replaces a general point transformation with two additions. 
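
The incremental scheme can be sketched in a few lines of Python. The sketch assumes that S⁻¹ is stored as a 3 by 3 affine matrix acting on the row vector (x, y, 1), and that pixels are sampled at integer positions; both are assumptions made for illustration, as are the function names. Because (1,0) is a vector rather than a point, the translation part of S⁻¹ does not affect ds, which is simply the first row of the matrix.

```python
def apply(p, m):
    """Map the point p = (x, y) through m using the row-vector form (x, y, 1) m."""
    x, y = p
    return (x * m[0][0] + y * m[1][0] + m[2][0],
            x * m[0][1] + y * m[1][1] + m[2][1])


def scanline_samples(x_start, x_end, y, s_inv):
    """Source positions for pixels x_start .. x_end - 1 on scan line y."""
    # One full transformation for the first pixel of the scan line.
    sx, sy = apply((x_start, y), s_inv)
    # ds: the image of the vector (1, 0) under s_inv.
    dsx, dsy = s_inv[0][0], s_inv[0][1]
    samples = []
    for _ in range(x_start, x_end):
        samples.append((sx, sy))
        sx += dsx                      # two additions per pixel replace
        sy += dsy                      # a general point transformation
    return samples
```

Each sample position would then be rounded or filtered to select a value from the source sample array; that step is omitted here. In a long scan line computed in floating point the repeated additions accumulate rounding error, which a real implementation would need to bound, for example by using fixed-point arithmetic.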


Bounding boxes 


An additional artifact, a bounding rectangle associated with a shape, can be introduced 
after mapping into the DCS. Bounding rectangles can be used to short cut the amount of 
computing done by the shape reduction and clipping machinery; i.e., if a bounding rectangle 
is exterior to the set of clipping regions, then the associated shape need not be processed 
further. On the other hand, if a bounding rectangle is totally within the clipping region, then 
the shape need not be clipped since it is totally within the region. 
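
The two tests reduce to a handful of comparisons. The Python sketch below makes the simplifying assumption that the clipping region is itself a single rectangle, so that the region and its bounding rectangle coincide; for a general region the trivial-accept test would need to be made against the region itself. The function names and the three result labels are hypothetical.

```python
def bounding_box(points):
    """Axis-aligned bounding rectangle (xmin, ymin, xmax, ymax) in the DCS."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def classify(shape_box, clip_box):
    """Decide how much work a shape needs against a rectangular clip region."""
    sx0, sy0, sx1, sy1 = shape_box
    cx0, cy0, cx1, cy1 = clip_box
    if sx1 < cx0 or sx0 > cx1 or sy1 < cy0 or sy0 > cy1:
        return "reject"    # entirely outside: no further processing
    if sx0 >= cx0 and sx1 <= cx1 and sy0 >= cy0 and sy1 <= cy1:
        return "accept"    # entirely inside: display without clipping
    return "clip"          # partial overlap: use the general machinery
```

Only shapes classified as "clip" need to go through the polygon clipping step; the other two cases are decided by at most eight comparisons.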


Conclusion 


A consistent, simple, device independent imaging model, and an implementation of the 
model, provide powerful tools to the application programmer. The model presented in this 
paper presents a general interface that has been useful over a wide range of applications and 
devices. Users have commented positively on the simplicity of the model and its ease of use. 


The aspects of this model that have been most successful are the following: 


Treating text characters like other graphic objects, and not as low-level, device 
dependent primitives, has been a big win. High level font dependent abstractions are 
kept in the application programs and not in the low-level device driving programs. 
Furthermore, because of optimization of special cases, no significant performance 
loss results from this generality. 


The generalized clipping facility is valuable. There are simple interactive metaphors 
and complex images that are difficult to build without the clipping facilities. In 
addition to the application advantages, the clipping facilities are used extensively in 
the implementation to avoid bounds testing in the low-level routines. 


With this model an application does not know the particulars of the device, or even 
whether the device is shared. High level display management facilities can allocate a 
portion of the screen, set the clipping region to that portion, and pass the display 
context, representing the entire “device,” to a subapplication. This aspect of device 
independence simplifies systems integration activities. 


ABSTRACT 


This essay offers a comparison of two modern schemes for controlling what laser 
printers print. One scheme, called PostScript®, is offered by Adobe Systems, Inc.; the 
other scheme, called Interpress®, is offered by the Xerox Corporation. A discussion of 
these two schemes has provoked a certain amount of interest recently. I offer a 
comparative analysis of the two from the point of view of programming language 
issues, system implementation considerations, and graphics imaging requirements. 


To a first order, PostScript and Interpress are quite similar. What I mean by that is that 
by comparison with all other current techniques for page image representation, the two 
can be considered to be nearly identical. I believe that it is worth looking at how they 
got to be that way; their similarities and differences can best be understood with a 
proper historical perspective. 


Part I: History 


The Evans and Sutherland Computer Corporation has for quite a number of years sold 
very expensive, very powerful graphics devices for CAD/CAM and for real-time simulation. 
The CAD/CAM machine is called The Picture System; the simulation machines are custom- 
built for each application. Custom simulation graphics machines are used for such purposes as 
providing the windshield graphics for military flight simulation systems, emulating what a 
pilot would see if he were looking out the window of a real airplane. These graphics systems 
use a very clever graphics model, developed by Ivan Sutherland and others, which is based on 
coordinate system transformations and line drawing. 


“PostScript” is a trademark of Adobe Systems, Inc. “Interpress” is a trademark of the Xerox Corporation. 


Although the Evans and Sutherland company is primarily in Salt Lake City, they had a 
small research office in Mountain View (California) in the early 1970’s. John Warnock was in 
charge of it, and John Gaffney worked for Warnock. One of the activities of the Mountain 
View office was to develop software for producing 3-dimensional graphical databases both for 
the Picture System and for the simulation machines. Working with Warnock, Gaffney had by 
1975 programmed and documented and released the first version of a programming language 
that was called “The Evans and Sutherland Design System”. 


Gaffney came to E&S from graduate school at the University of Illinois, where he had 
used the Burroughs B5500 and B6500 computers. Their stack-oriented architectures made a 
big impression on him. He combined the execution semantics of the Burroughs machines with 
the evolving Evans and Sutherland imaging models, to produce the Design System. Like all 
successful software systems, the Design System slowly evolved as it was used, and many 
people contributed to that evolution. 


John Warnock joined Xerox PARC in 1978 to work for Chuck Geschke. There he 
teamed up with Martin Newell in producing an interpreted graphics system called JAM. 
“JAM” stands for “John And Martin”. JAM had the same postfix execution semantics as 
Gaffney’s Design System, and was based on the Evans and Sutherland imaging model, but 
augmented the E&S imaging model by providing a much more extensive set of graphics 
primitives. Like the later versions of the Design System, JAM was “token based” rather than 
“command line based”, which means that the JAM interpreter reads a stream of input tokens 
and processes each token completely before moving to the next. Newell and Warnock 
implemented JAM on various Xerox workstations; by 1981 JAM was available at Stanford on 
the Xerox Alto computers, where I first saw it. 


In the meantime, various people at Xerox were building a series of experimental raster 
printers. The first of these was called XGP, the Xerox Graphics Printer, and had a resolution 
of 192 dots to the inch. Xerox made XGP’s available to certain universities, and by 1972 they 
were in use at Carnegie-Mellon, Stanford, MIT, Caltech, and the University of Toronto. Each 
of those organizations produced its own hardware and software interfaces. The XGP is 
historically interesting only because it is the first raster printer to gain substantial use by 
computer scientists, and was the arena in which a lot of mistakes were made and a lot of 
lessons learned. 


To replace the XGP, Xerox PARC developed a new printer called EARS, and then 
another newer printer called Dover. After the agony of converting software from XGP to 
EARS, various Xerox people realized that applications programs generating files for the XGP 
or for EARS should not be tied to the device properties of the printer itself. Bob Sproull and 
William Newman, of Xerox PARC, developed a relatively device-independent page image 
description scheme, called “Press format”, which was used to instruct raster printers what to 
print. 





As part of an extensive grant program to selected universities, Xerox donated Dover 
printers and made documentation of the Press format available under a nondisclosure 
agreement. As far as I know, that nondisclosure agreement has never been lifted, though 
information about Press format has been widely enough distributed that by 1982 researchers 
at the Swiss Federal Institute of Technology (EPFL) at Lausanne had given conference papers 
about their own independent implementation of Press format. 




















Press format was a smashing success; it revolutionized laser printing technology in the 
academic and research communities, and stimulated a large number of people to think about 
issues of device-independent print graphics. Nevertheless, Press format had its limitations, 
and various people felt the need to revise the basic design. 


Sproull left Xerox in 1978 to become a professor of computer science at CMU. Newman 
returned home to England to become an independent consultant. Martin Newell left Xerox to 
join Cadlinc Corp. Warnock and Geschke remained at Xerox. 





While at CMU, Sproull began making plans for a new version of Press that would 
combine the graphics model of JAM with the page image description properties of Press. 
Sproull returned to Xerox for a sabbatical leave in 1982, and enlisted the help of Butler 
Lampson in the creation of the new page image description language that Warnock dubbed 
“Interpress”. The name caught on. 


While it is difficult to separate the contributions made by Sproull and Lampson, it is not 
incorrect to say that Lampson and Warnock produced the execution model of Interpress 
while Sproull and Warnock produced the imaging model. It is also approximately correct to 
characterize this first version of Interpress as being derived from the graphics model and 
execution model of JAM with additional protection and security mechanisms derived from 
experience with programming languages like Euclid and Cedar, and a careful silence on the 
issue of fonts. The trio worked under Geschke’s direction, and Geschke was responsible for 
refereeing disagreements and for making certain that the resulting design was acceptable to 
the rest of Xerox. 

















My own involvement with the Interpress effort is difficult to explain. Sproull was my 
thesis adviser. At CMU, we had discussed many of the issues in page description languages at 
length. As a consultant to PARC during the Interpress design work, my primary activity was 
one of writing or rewriting the Interpress materials. I also represented a “consumer” point of 
view rather than a “designer” point of view, and often complained about aspects of the 
evolving language. 


I feel uncomfortable discussing the issues involved in the transition of Interpress from an 
artifact of the research lab to a marketable product. I shall therefore not discuss them. During 
this transition phase Geschke and Warnock left PARC (December 1982) to start Adobe 
Systems, Sproull returned to CMU (June 1983), and Lampson left PARC to join DEC 
Research (November 1983). 


Warnock had various philosophical differences with the final Interpress design, and he 
voiced those differences to the rest of the Interpress group at every opportunity. At Adobe, 
Geschke and Warnock saw the opportunity to try again, with a design group composed of 
people who shared his ideology. They enlisted Doug Brotz, a Xerox PARC researcher who 
had had no involvement with any of the Press/JAM/Interpress world, to join them in 
developing a new page description language named PostScript, based on combining the 
execution model and imaging model of JAM with a protection structure more reminiscent of 
C or the Unix shell than of Euclid or Cedar. While not at all a copy of JAM, PostScript 
resembles JAM more than it resembles Interpress. PostScript also embraced various Unix 
notions, such as the use of text streams to convey information. 


On March 15, 1984, Adobe shipped its first PostScript manual to a potential customer. 
That PostScript manual was printed on a PostScript printer using a Times Roman font 
licensed from Allied corporation and digitized by Adobe. 


At that time all aspects of the Interpress project were still very proprietary, and it 
appeared to me that Xerox had no interest in releasing them. However, on April 25, 1984, I 
received a Xerox press release announcing the availability of Interpress documentation. I 
finally managed to get my hands on a copy of the Interpress documentation in February of 
1985, and was quite surprised to discover that the Interpress documentation had not been 
printed on an Interpress printer, but was instead printed on a Press format printer, using the 
same Times-like and Helvetica-like fonts that I had become familiar with at CMU and 
Stanford on the Dover printers. 


Part II: Comparison 


Part I outlined the history of PostScript and of Interpress, as I have been able to 
determine it. With that historical background, I now offer a comparison of the two languages. 


While there are quite a number of extant schemes for the description of printed images, 
most of them are better described as “data structures” than as “languages”. In particular, only 
PostScript and Interpress are directly executable. 


Languages can be compared at several different levels. Languages have a lexical 
representation, a syntax, a semantic model, an intended style of usage, and implementation 
considerations. 


Lexical Considerations 


The lexical properties of a language define the way the tokens of the language are 
represented in terms of bits, bytes, or characters. The FORTRAN language was defined in 
terms of a particular character set, which the implementor was expected to use. The ALGOL 
language was defined in terms of keywords and symbols, and the language definition left the 
implementor free to choose how he would represent those keywords in terms of characters 
available on his computer. For example, the FORTRAN definition of a “DIMENSION” 
statement is that it is the letter “D” followed by the letter “I” followed by the letter “M”, etc. 
The ALGOL definition of the “BEGIN” keyword was merely that it was a keyword; the 
ALGOL standard document used boldface to identify keywords. When ALGOL is 
implemented on computers whose character sets include boldface, the implementors normally 
use the boldface characters as a way of identifying keywords. When ALGOL is implemented 
on other computers, the implementors choose other schemes for identifying keywords, such as 
putting them in quotes or putting them in all capital letters. 


Both PostScript and Interpress have an operator called MOVETO, and in both languages 
it does exactly the same thing, which is identical to what the MOVETO operator did on the 
Evans and Sutherland hardware that spawned this graphics model. Let's look at how that 
operator would be represented in the two languages. 


The PostScript language is defined in terms of characters, like FORTRAN. The 
definition of the PostScript operator “MOVETO” is the letter “M” followed by the letter “O” 
followed by the letter “V”, etc. The Interpress language is defined in terms of keywords; the 
definition of the Interpress operator “MOVETO” is that it is a keyword in the ALGOL sense. 
The Interpress 2.1 standard suggests that MOVETO can be represented with the serial 
number 25 in a standard encoding that the standard provides, but the definition of the 
MOVETO keyword is independent of the choice of encoding. 


Since PostScript is defined in terms of sequences of characters, it is always possible to 
assume that a PostScript file can be transmitted over any link capable of sending characters, 
and can be stored in any device capable of holding characters. Since Interpress is defined 
more abstractly, it is not necessarily possible to make any assumptions at all about a particular 
Interpress file. However, any Interpress encoding can be translated into any other Interpress 
encoding, so it is always possible to take an Interpress file and translate it into a stream of 
characters which will then have properties identical to PostScript’s. Conversely, it is always 
possible to translate a PostScript program into a tokenized keyword form, though the 
PostScript standard does not suggest any particular tokenization scheme. 
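
The interchangeability of encodings can be sketched with a toy model in Python. Only the number 25 for MOVETO comes from the text above; the other operator numbers, the one-byte-per-operator layout, and the omission of operands are assumptions made purely for illustration, and the real Interpress encoding is considerably richer.

```python
# Toy model: one abstract token stream, two interchangeable encodings.
# Only the number 25 for MOVETO comes from the text; the other numbers
# and the one-byte-per-operator layout are made up for illustration.
OPCODES = {"MOVETO": 25, "LINETO": 26, "SHOWPAGE": 27}
NAMES = {code: name for name, code in OPCODES.items()}


def to_text(tokens):
    """Character encoding: operator names separated by spaces."""
    return " ".join(tokens)


def to_bytes(tokens):
    """Binary encoding: one op-code byte per operator."""
    return bytes(OPCODES[t] for t in tokens)


def from_text(text):
    """Recover the token stream from the character encoding."""
    return text.split()


def from_bytes(data):
    """Recover the token stream from the binary encoding."""
    return [NAMES[b] for b in data]


program = ["MOVETO", "LINETO", "SHOWPAGE"]
as_text = to_text(program)
as_bytes = to_bytes(program)
```

Both encodings decode to the same token list, which is the sense in which the choice of encoding is independent of the language definition.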


It is worth mentioning that the word “token” is slightly overloaded here. A “tokenization 
scheme” is a means of doing data compression, wherein a sequence of characters is called a 
“token” and is replaced by a token number, which will occupy less space. However, a 
language can have tokens without having a tokenization scheme. Both PostScript and 
Interpress have an execution semantics that is defined in terms of things called “tokens”. The 
Interpress tokens are normally represented by tokenization schemes (i.e. replaced with 
integers), while the PostScript tokens are normally left as sequences of characters. In later 
sections of this message the word “token” will be used to mean either the PostScript kind of 
token or the Interpress kind of token; by the time they get to the interpreter they are roughly 
the same thing. 


The Interpress 2.1 standard defines a particular encoding of Interpress, and gives bit and 
byte formats, decimal integer operator numbers, and so forth. This encoding is a full binary 
encoding, using all 8 bits of each byte, which means that it cannot always be sent over a serial 
character link. The Interpress standard encoding of a page description normally occupies a 
smaller number of bytes than the equivalent PostScript character representation. This is 
possible because binary encodings make more efficient use of the bits. 


Interpress files are clearly intended to be transmitted via XNS protocols over Ethernet. 
In its current form, without further processing or re-encoding, Interpress is not suitable for 
transmission over character-protocol lines. PostScript files are clearly intended to be 
transmitted over character-protocol lines. Like all character stream protocols, PostScript can 
also be transmitted over Ethernet, but a PostScript file will use more bytes than the 
corresponding Interpress file. 


Text files such as PostScript sources are highly redundant (i.e. they make inefficient use 
of their bits) and can be run through data compression programs (such as the Unix “compact” 
program) to reduce the amount of space they occupy in storage and during transfer. Data 
compression techniques will probably not yield much further compression of Interpress files, 
because the information is already quite tightly packed. After compression of both, the 
PostScript and Interpress representations of an image will likely occupy approximately the 
same number of bits. As an example, the PostScript file representing this article occupies 
113088 bytes; a data compression program reduced it to 45331 bytes, which is a 60% reduction 
(40% of the original size). 
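
The effect can be tried on made-up data. The sketch below, in Python, builds the same hypothetical page in a character form and in a packed binary form and compresses each with zlib. The page contents, the record layout, and the op code 26 are all assumptions, and zlib stands in for the “compact” program mentioned above; the last line simply rechecks the arithmetic quoted for this article.

```python
import struct
import zlib

# Hypothetical page: 2000 LINETO operations on integer coordinates.
points = [(x, (x * x) % 700) for x in range(2000)]

# Character form: "x y lineto" lines, as in a PostScript-like text file.
text_form = "".join(f"{x} {y} lineto\n" for x, y in points).encode("ascii")

# Packed binary form: a made-up 1-byte op code plus two 16-bit integers.
binary_form = b"".join(struct.pack(">BHH", 26, x, y) for x, y in points)

# Fraction of the original size that remains after compression.
text_ratio = len(zlib.compress(text_form)) / len(text_form)
binary_ratio = len(zlib.compress(binary_form)) / len(binary_form)

# The figures quoted for this article: 113088 bytes reduced to 45331.
article_reduction = 1 - 45331 / 113088
```

Printing `text_ratio` and `binary_ratio` shows how much each representation shrinks on this particular data; other page contents will give other numbers.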


Syntactic Considerations 


The syntactic issues (or issues of syntax, if you will) of a language are the means by which 
an interpreter for the language distinguishes variables from operators from constants from 
function calls from quoted strings, and by which it determines whether or not a certain 
sequence of characters or tokens is in fact a “legal” construct in the language. 


As languages in general go, both PostScript and Interpress are remarkably free of syntax. 
As token-oriented postfix languages, each token of the language is “executed” as soon as it is 
identified, and that execution will either succeed or fail depending on the state of the 
execution environment at that point. 


Nevertheless, both languages have a small amount of syntax, though they differ radically 
in the nature and application of this syntax. In fact, the primary area in which the PostScript 
language and the Interpress language are incontrovertibly and irrevocably different is in their 
syntax. 


As explained above (Lexical Considerations), PostScript is defined in terms of character sequences. 
A PostScript program is a series of character tokens, separated by white space characters. That 
program is fed to an interpreter to be executed: the interpreter reads in the characters and 
assembles them into words (i.e. tokens), then looks up the tokens in dictionaries to determine 
their meaning. In this regard PostScript is similar to many other programming or command 
languages: if the PostScript interpreter sees the command “MOVETO”, it finds the current 
definition of that string, and then performs whatever action is requested in that definition. 


By contrast, Interpress is defined in terms of byte codes, which behave more like the 
instruction codes of a hardware interpreter than like a traditional programming language. 
Instead of the letters “MOVETO”, an Interpress file will have a byte whose binary value is 25; 
the number 25 is then used to index an operation code table which directs the interpreter to 
the program implementing the MOVETO operation. 
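
The two dispatch styles can be put side by side in a few lines of Python. This is a sketch only: the `moveto` stand-in, the state dictionary, and the 256-entry table are assumptions, and only the number 25 is taken from the text.

```python
# Toy interpreters; the operator implementation is a stand-in.
def moveto(state, x, y):
    state["current_point"] = (x, y)


# PostScript style: look the name up in a dictionary at execution time.
name_dictionary = {"moveto": moveto}


def execute_by_name(state, name, *operands):
    name_dictionary[name](state, *operands)


# Interpress style: the byte value indexes an operation-code table.
opcode_table = [None] * 256
opcode_table[25] = moveto          # 25 is the MOVETO number cited above


def execute_by_opcode(state, opcode, *operands):
    opcode_table[opcode](state, *operands)
```

Both routes end at the same procedure; they differ only in whether the interpreter resolves a name or follows an index.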


The byte codes of Interpress can be viewed as a compiled form of the character codes of 
PostScript. One could imagine a translator that passed over a PostScript file, looked up each 
name, and produced an output file whose contents was the binary identification of the thing 
found during the lookup. In fact, the Interpress standard document explains that the two 
forms are equivalent, and the Introduction to Interpress document explains how to write a 
program to convert one to another. 


There is, however, a crucial difference between the PostScript and Interpress naming 
schemes that makes them very different, and makes impossible the above-mentioned 
imagined compiler to translate PostScript into Interpress. That difference is best understood 
as a semantic difference, and will be explained in the next section. 


Returning to syntactic issues, an Interpress file has what is called “static structure” or 
“lexical structure”. This means that you can look at an Interpress file and make structural 
assumptions about what you find there. For example, an Interpress file is defined to be a 
sequence of “bodies”; each body is a sequence of operators and operands. The first body is 
the “preamble”, or setup code; all following bodies correspond to printed pages. If an 
Interpress file has 11 bodies, then it will print as 10 pages. 


By contrast, a PostScript file has no fixed lexical structure; it is just a stream of tokens to 
be processed by the interpreter. PostScript prints a page whenever the SHOWPAGE operator 
is executed. If a PostScript file contains a loop from 1 to 10, with a SHOWPAGE operator 
inside the loop, then it will print 10 pages even though there is only one actual call to 
SHOWPAGE in the file. However, since PostScript is a textual language, and since it has a 
“comment” facility like the C /*....*/ or Pascal {...}, it is possible for the creator of a 
PostScript file to represent whatever additional information is desired. It is a slight misnomer 
to call this a comment facility, because the normal use of the word “comment” in 
programming languages implies that the contents of the comment are irrelevant. PostScript 
comments are irrelevant in the sense that they do not affect the image produced by a 
PostScript file, but they do convey machine-readable information about the structure of the 
document. 


A PostScript client is free to choose any structuring scheme that he wants, and the tool 
that he has available to implement this structuring scheme is the PostScript comment. There is 
a particular “standard” structuring convention documented along with PostScript by which 
page boundaries and other lexical information can be marked. A PostScript file that follows 
that convention is called a “conforming” file, but it is a convention and not a rule; the printed 
image produced by a nonconforming PostScript file will be identical to that produced by the 
equivalent conforming PostScript file. Conversely, the structure of a PostScript file, as 
represented by the structuring convention, is completely independent of the appearance of 
the page images; the actual PostScript text appears to be a series of comments as far as the 
structuring systems are concerned. 
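
A structuring tool of this kind never executes the file; it only reads the comments. The Python sketch below assumes page boundaries are marked by comment lines beginning `%%Page:` and closed by `%%Trailer`, which is how I recall Adobe's published convention; treat the exact keywords and the sample file as assumptions.

```python
sample = """%!PS-Adobe-1.0
%%Page: 1 1
72 720 moveto (first) show showpage
%%Page: 2 2
72 720 moveto (second) show showpage
%%Trailer
"""


def split_pages(source):
    """Group lines under the most recent %%Page: comment, executing nothing."""
    pages, current = {}, None
    for line in source.splitlines():
        if line.startswith("%%Page:"):
            current = line.split()[1]      # the page label
            pages[current] = []
        elif line.startswith("%%Trailer"):
            current = None
        elif current is not None:
            pages[current].append(line)
    return pages


pages = split_pages(sample)
```

The tool trusts the comments entirely: if a file's comments misstate where its pages begin, `split_pages` will be wrong even though the file prints correctly.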


The technique of mixing two different languages in one file, so that a processor for one 
language sees the text of the other language as comments, is not new. Perhaps the most 
widely-known instance of this scheme is Don Knuth’s “WEB” system, in which Pascal and 
TeX are woven together in such a way that the Pascal program looks like a comment to the 
TeX interpreter and the TeX source looks like a comment to the Pascal compiler. 


This absence of fixed lexical structure in PostScript is a two-edged sword. On the one 
hand, it offers more flexibility in creating page images, especially repetitive ones; on the other 
hand, it provides more opportunities to make mistakes. 


One final syntactic issue is perhaps worth mentioning, though it could also be considered 
a semantic issue. Interpress does not support “variables” so much as it supports “registers”, in 
the hardware sense. All storage in Interpress is accessed by address and not by name. What 
would be called a “local variable” in a programming language is represented in Interpress by 
an integer subscript into the procedure’s frame. All programming languages must ultimately 
reduce their variable names into memory locations; Interpress asks that this translation be 
performed by the creator of the Interpress file and not by the interpreter. An obvious benefit 
of this approach is efficiency: no name lookups need be performed as the file is being 
printed. An obvious drawback of this approach is the restricted name space available to the 
programmer and the extra care that must be taken to manage addresses instead of names. By 
contrast, PostScript supports ordinary named variables. 





Semantics 


Since both Interpress and PostScript derive their semantics from the same source, it 
stands to reason that the semantics would be similar. Both use similar graphical semantics, the 
same imaging model, and both use very similar execution semantics. The differences are 
minor, though one could imagine that the consequences of those differences might be major. 





There are two substantive differences between the graphical semantics of PostScript and 
Interpress 2.1, namely that Interpress has no facility for describing curves, and the Interpress 
standard is completely silent on the issue of fonts. 


A curve can of course be approximated with a series of line segments, and if the line 
segments are short enough the resulting appearance will be identical, but many classes of 
curved lines, such as those appearing in fonts, can be described very succinctly in terms of the 
PostScript CURVETO operator while requiring a tedious collection of short line segments to 
describe in Interpress. Also, the choice of line segment length must be determined as a 
function of the printer resolution, which means that Interpress, which does not support curves 
directly in the language, cannot reliably make curves in a device-independent fashion. 
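
The dependence on resolution can be made concrete. The Python sketch below assumes the curve is a cubic Bezier with coordinates in inches, and that a deviation of half a device pixel is acceptable; both are assumptions of mine, not statements from either standard. It picks the segment count from the usual bound for piecewise-linear interpolation, so the creator of the file must know the printer resolution before writing a single line segment.

```python
import math


def flatten_cubic(p0, p1, p2, p3, dpi, tolerance_pixels=0.5):
    """Approximate a cubic Bezier curve (coordinates in inches) by line segments.

    The segment count n comes from the standard bound for piecewise-linear
    interpolation, error <= max|B''| / (8 n^2), so that the polyline stays
    within tolerance_pixels of the curve at the given resolution.
    """
    tol = tolerance_pixels / dpi                      # tolerance in inches
    d1 = math.hypot(p0[0] - 2 * p1[0] + p2[0], p0[1] - 2 * p1[1] + p2[1])
    d2 = math.hypot(p1[0] - 2 * p2[0] + p3[0], p1[1] - 2 * p2[1] + p3[1])
    second_derivative_bound = 6 * max(d1, d2)
    n = max(1, math.ceil(math.sqrt(second_derivative_bound / (8 * tol))))
    points = []
    for i in range(n + 1):
        t = i / n
        a, b, c, d = (1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t * t * (1 - t), t ** 3
        points.append((a * p0[0] + b * p1[0] + c * p2[0] + d * p3[0],
                       a * p0[1] + b * p1[1] + c * p2[1] + d * p3[1]))
    return points


curve = ((0, 0), (0, 1), (1, 1), (1, 0))
at_300 = flatten_cubic(*curve, dpi=300)     # points for a 300 dots/inch device
at_1200 = flatten_cubic(*curve, dpi=1200)   # points for a 1200 dots/inch device
```

For this one-inch arch the bound asks for 26 segments at 300 dots per inch and 51 at 1200, so a file flattened for the coarser printer is visibly polygonal on the finer one, and a file flattened for the finer one carries roughly twice the data the coarser one needs.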





On the issue of fonts, the Interpress standard states only that a font is an operator that 
will be executed for you when appropriate, and that the operators for that font are defined “in 
the Environment”. A PostScript font is just an ordinary PostScript defined operator, and the 
PostScript manual gives explicit instructions for creating user-defined fonts and making those 
font definitions be part of a PostScript file. One could imagine that it is possible to write an 
Interpress composed operator (in Interpress, of course) to behave like a user-defined font, but 
the Interpress implementations do not currently have any mechanism for recognizing that an 
operator is in fact a user-defined font and should therefore receive any kind of special 
treatment. This is not a deficiency in Interpress, just a silence, accompanied by a deficiency in 
current implementations (this and other implementation issues are discussed in the last 
section). 


There are three consequential differences between PostScript execution semantics and 
Interpress execution semantics: user-defined operators, the nature of the “firewalls” between 
pieces of the program, and error recovery. 


In Interpress, a user-defined operator is syntactically different from an intrinsic operator, 
and requires an explicit “DO” operator to call it. In PostScript a user-defined operator is 
syntactically identical to an intrinsic operator, and in fact any intrinsic operation can be 
redefined by simply making a new entry for that operator's name in the appropriate 
dictionary. This is stylistically similar to the difference in lexical structure: Interpress 
guarantees that if a byte code 25 (the MOVETO operator) is found in a file, it will, when 
executed, perform a standard MOVETO. PostScript permits the redefinition of the name of 
any operator. If you want to redefine the meaning of MOVETO, then you can do so, and 
when the characters “M O V E T O” are found in a PostScript file, the redefined operator will 
be executed instead. To execute a PostScript user-defined operator you just include its name, 
the same way you execute any other operator. To execute an Interpress user-defined operator, 
you execute the DO operator (or a variation of it), after pushing onto the stack the thing that 
you want to execute. 
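
Redefinition by dictionary entry can be illustrated with a small Python model. The operator table, the trace list, and the `shifted_moveto` redefinition are hypothetical; the point is only that the same characters in the file reach a different procedure once the dictionary entry has changed.

```python
trace = []


def builtin_moveto(x, y):
    trace.append(("moveto", x, y))


# PostScript style: the name is looked up at execution time, so a new
# dictionary entry changes what later occurrences of "moveto" do.
operators = {"moveto": builtin_moveto}


def run(name, *operands):
    operators[name](*operands)


run("moveto", 10, 20)                       # runs the built-in


def shifted_moveto(x, y):                   # hypothetical user redefinition
    builtin_moveto(x + 100, y)


operators["moveto"] = shifted_moveto
run("moveto", 10, 20)                       # same characters, different effect
```

Because the binding is resolved only when each token is executed, nothing short of running the file reveals which procedure a given occurrence of the name will reach.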


Analogously with the static structural issues, the PostScript user-defined-operator 
scheme offers more flexibility than Interpress but carries with it more dangers. Like the old 
saw about giving one enough rope to hang himself, the additional flexibility of the PostScript 
scheme requires discipline on the part of the user. Furthermore, just as PostScript has a 
convention for the voluntary inclusion of static structure in a file, it has a mechanism by 
which a PostScript program can reference the true built-in version of an operator and not the 
current, possibly user-redefined, version of an operator. From the point of view of language 
design, this scheme is not terribly elegant, but it is quite practical, as it provides a mechanism 
for the solution of all of the problems associated with operator redefinition and the 
prevention thereof. 


It is this ability to redefine built-in operators that makes the compilation of a textual 
PostScript file into an encoded Interpress file (mentioned above under Syntax) impossible. A 
static analysis cannot determine the operator that will be executed when the textual token is 
interpreted. By contrast, it is easy to translate Interpress into PostScript, because all of 
Interpress’ semantic capabilities have direct equivalents in PostScript, and the lexical 
translation is straightforward. 


Interpress has a distinction between “bodies” and “operators”. A “body” is a sequence of 
Interpress tokens. The Interpress operator “MAKESIMPLECO” (make simple composed 
operator) translates a body into an operator. Like all other Interpress operators that reference 
bodies (referred to in the Interpress standard as “body operators”), the MAKESIMPLECO 
operator is prefix and not postfix. This was done to make it easier for small computers to 
implement Interpress interpreters; it has the interesting side-effect of making it impossible for 
an Interpress program to generate and then execute a piece of Interpress source code. I would 
guess that the entire reason for the distinction between Interpress bodies and operators is to 
enable a clean prefix implementation of body operators while at the same time permitting the 
more conventional postfix use of expressions of type “operator”. 





By contrast, PostScript represents operator bodies as arrays of PostScript tokens. The 
PostScript lexical scanner processes a body by building an array out of the tokens that it finds 
in the input stream; that body is then handled as an ordinary data value in the language, and 
it can be stored into variables, executed, modified, searched or searched for, etc. The 
translation of a body into something like an Interpress operator consists merely of returning 
the address where the body is stored; that can be handled by the PostScript type system and 
does not require a special conversion operator. Consequently, a PostScript program is able to 
generate an array of PostScript operators, however it so chooses, and then declare that array to 
be a new PostScript operator and have it be executed just like any other PostScript operator. 
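
The idea that a body is ordinary data can be sketched with a toy postfix interpreter in Python. The operator names `add`, `mul`, and `times8` and the list-based stack are assumptions for illustration and are not PostScript's actual machinery.

```python
def execute(body, stack, operators):
    """Run a body: push numbers, look up and call names."""
    for token in body:
        if isinstance(token, (int, float)):
            stack.append(token)
        else:
            operators[token](stack)


operators = {
    "add": lambda s: s.append(s.pop() + s.pop()),
    "mul": lambda s: s.append(s.pop() * s.pop()),
}

# A body is ordinary data: build one at run time...
body = []
for _ in range(3):
    body += [2, "mul"]                      # multiply by 2, three times

# ...then install it under a name, after which it runs like any operator.
operators["times8"] = lambda s: execute(body, s, operators)

stack = [5]
execute(["times8", 1, "add"], stack, operators)
```

After the last line the stack holds the single value 41: the generated body multiplied 5 by 8, and the built-in `add` then added 1, with no syntactic difference between the two kinds of operator.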





The second important semantic difference between PostScript and Interpress is the set of 
mechanisms that they offer for protecting one piece of the file from side effects in another. As 
you might be able to guess if you have read this far, the Interpress protection mechanism is 
static and mandatory while the PostScript protection mechanism is dynamic and optional. 
This kind of mechanism is often referred to as a “firewall”. 





An Interpress file consists of a series of bodies. Each body is executed completely 
independently of each other body. In particular, at the beginning of each page body, the 
execution environment is restored to the state that it had at the end of execution of the 
preamble, so that each page body is executed as if it were the only page in the document. 
There is absolutely nothing that the code in one Interpress page can do that will have any 
effect on the execution of the code in any other Interpress page, and the Interpress language 
guarantees that independence. This permits, for example, the pages to be executed or printed 
in any order, front to back or back to front, or in folios of 16 pages at a time, with complete 
confidence that the appearance of the pages will not change. 


By contrast, a PostScript file has no static structure, so there is no convenient place to 
build automatic firewalls. PostScript provides, instead, two pairs of operators by which a 
PostScript user can build his own firewalls wherever he wants them. There is an operator 
called SAVE, and another operator called RESTORE. The RESTORE operator restores the 
execution state of the machine back to what it was when the last SAVE operator was executed. 
Thus, if a PostScript user wants to have pages that are firewalled against each other, then he 
puts a SAVE operator at the beginning of the page and a RESTORE operator at the end of 
the page. If the PostScript user wants to play tricks, and build PostScript files that do bizarre 
things with the execution state between pages, he is free to do so by leaving out the SAVE and 
RESTORE. 
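
An optional firewall of this kind can be modelled in a few lines of Python. The `Interpreter` class, its two state fields, and the deep-copy snapshot are assumptions chosen to keep the sketch short; a real PostScript interpreter saves a good deal more than this.

```python
import copy


class Interpreter:
    def __init__(self):
        self.state = {"font": "Times", "linewidth": 1}
        self._saved = []

    def save(self):
        """Snapshot the execution state."""
        self._saved.append(copy.deepcopy(self.state))

    def restore(self):
        """Return to the state recorded by the most recent save."""
        self.state = self._saved.pop()


def print_page(interp, page_code, firewalled=True):
    if firewalled:
        interp.save()
    page_code(interp)
    if firewalled:
        interp.restore()


def page_one(interp):
    interp.state["font"] = "Helvetica"       # a side effect inside page 1
```

With `firewalled=True` the font change made by `page_one` is gone before the next page runs; with `firewalled=False` it leaks into whatever follows, which is exactly the freedom, and the hazard, described above.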





By now you can probably see the fundamental philosophical difference between 
PostScript and Interpress. Interpress takes the stance that the language system must guarantee 
certain useful properties, while PostScript takes the stance that the language system must 
provide the user with the means to achieve those properties if he wants them. With very few 
exceptions, both languages provide the same facilities, but in Interpress the protection 
mechanisms are mandatory and in PostScript they are optional. Debates over the relative 
merits of mandatory and optional protection systems have raged for years not only in the 
programming language community but also among owners of motorcycle helmets. While the 
Interpress language mandates a particular organization, the PostScript language provides the 
tools (structuring conventions and SAVE/RESTORE) to duplicate that organization exactly, 
with all of the attendant benefits. However, the PostScript user need not employ those tools. 


Before taking a stand on this issue, you must remember that neither Interpress nor 
PostScript is engineered to be a general-purpose programming language, but rather to be a 
scheme for the description of page images, so it is not necessarily valid to apply programming 
language lore to these two systems. 


The third area in which there are significant semantic differences between PostScript and 
Interpress is in error handling and error recovery. The Interpress 2.1 standard is slightly vague 
as to what happens when various error conditions occur; one assumes that when an 
implementation of Interpress becomes available, some reasonable error-handling behavior 
will be added to the language. The PostScript language provides a user-extensible error- 
recovery mechanism that is keyed on PostScript’s ability to redefine intrinsic operators. 
Whenever an error of any kind occurs in PostScript, be it the printer out of paper, the file 
asking for a font that doesn’t exist, or a division by zero, the PostScript interpreter responds 
by executing an “error operator”. If the error operator has not been redefined, then some 
standard action is taken; sometimes the standard action is to do nothing, while sometimes the 
standard action is to abort or to retry. The standard action is merely the execution of the error 
operator. 


The Interpress documentation does not offer much explanation, one way or another, of 
error handling. The Interpress standard describes certain kinds of error conditions that can 
occur, such as “appearance error” or “master error”, but does not specify exactly what will 
happen if those errors occur. I assume that the reason the standard is vague is to provide 
leeway to the implementors in error handling. The Interpress language standard does not 
describe any technique by which an Interpress master can control or modify the error 
recovery actions. 


When a PostScript error occurs, an error operator is executed. There is a set of built-in 
error operators provided as part of PostScript, and documented like all other operators. If a 
PostScript user wants to change the error handling of a PostScript printer, he simply changes 
the dictionary entry for the relevant error operator. Depending on the relative position of that 
redefinition with respect to SAVE and RESTORE operators in the PostScript file, the 
redefinition will have a certain lifetime. A SAVE and RESTORE pair is wrapped around each 
separate file printed by a PostScript printer, so that the redefinition does not carry over to 
other jobs. The manager of an installation can change the overall default of the printer by 
sending it a redefinition, during printer startup, before entering the SAVE/RESTORE loop 
around each print job. 
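
Error handling by dictionary entry can be sketched the same way. In the Python model below the handler key `undefined`, the log messages, and the misspelled operator are illustrative assumptions; the mechanism shown is only that the interpreter calls whatever handler is currently installed.

```python
log = []


def default_undefined(name):
    log.append(f"abort: {name} is undefined")


# Error handlers live in a dictionary like every other operator.
# The handler name and messages are illustrative.
error_operators = {"undefined": default_undefined}
operators = {"showpage": lambda: log.append("page printed")}


def run(name):
    if name in operators:
        operators[name]()
    else:
        error_operators["undefined"](name)   # whatever is currently installed


run("shwopage")                               # misspelled: default handler
error_operators["undefined"] = lambda name: log.append(f"ignored {name}")
run("shwopage")                               # same error, user-defined handling
```

Wrapping the second half of this sequence in a SAVE and RESTORE pair would limit the lifetime of the redefinition, as described above.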


Like so much of PostScript’s flexibility, the ability to redefine operators is a two-edged 
sword. Redefining an operator can be used to advantage by clever and knowledgeable users, 
and it can be used as a technique for fixing bugs in a PostScript implementation. For example, 
if an accounting package were not provided as part of a PostScript implementation, the 
owners of a PostScript printer could add page accounting to their printer by downloading a 
redefinition of the SHOWPAGE operator that kept accounting information. However, a user 
might be able to disable that accounting by doing yet another redefinition that disabled the 
installation’s accounting. To circumvent this class of problem, PostScript provides a 
mechanism for declaring certain objects to be read-only, or execute-only. The management of 
a shared PostScript printer can specify that part of its power-up or restart sequence is to load a 
configuration file; that configuration file can redefine certain operators (for the purpose of 
bug fixing or accounting or any other reason) and then, if desired, mark the redefined 
operators read-only so that they cannot be further redefined. As a language mechanism this is 
very clumsy, but as an operational technique it is effective. 
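
The accounting scenario can be modelled as follows. This Python sketch protects entries by name, which is a simplification I have assumed for brevity; PostScript attaches the read-only attribute to objects, and the class and method names here are hypothetical.

```python
class OperatorDictionary:
    def __init__(self):
        self._entries = {}
        self._read_only = set()

    def define(self, name, procedure):
        if name in self._read_only:
            raise PermissionError(f"{name} is read-only")
        self._entries[name] = procedure

    def protect(self, name):
        self._read_only.add(name)

    def run(self, name):
        self._entries[name]()


pages_charged = []
ops = OperatorDictionary()
ops.define("showpage", lambda: None)                      # stand-in built-in

# Installation's start-up configuration: count every page, then lock the entry.
ops.define("showpage", lambda: pages_charged.append(1))
ops.protect("showpage")

try:
    ops.define("showpage", lambda: None)                  # a user's attempt to bypass
    bypassed = True
except PermissionError:
    bypassed = False
ops.run("showpage")
```

The user's redefinition is refused, so the page is still counted.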


System Issues 


The ultimate purpose of a page representation language, such as PostScript or Interpress, 
is to store page images, and to convey them to a printer when they need to be printed. The 
relevant “system issues” are those issues that affect how well, and how easily, a page 
representation language works for this purpose. 


The most obvious difference between PostScript and Interpress is the tight coupling 
between Interpress and the XNS network system, as compared to the loose coupling between 
PostScript and any particular brand of computer or network. An Interpress file is a structured 
binary file, which must be transmitted over networks that will not disturb its contents. A 
PostScript file is a text file, which can be sent over any communications medium supporting 
alphanumeric characters. In that regard, the PostScript file is more universal. 


There is no particular difficulty in using a page description language to build a page 
image on a computer, then ship it to a nearby printer and print it immediately. The computer 
can be configured to have specific knowledge of the nature of the printer, and can customize 
the print file for that printer. 


The difficulties arise when the printer is not closely coupled to the computer generating 
the image description file. Perhaps the file is being broadcast, or stored in a database for later 
retrieval by unknown persons, or transmitted to an unknown recipient over a long-distance 
computer network. Page image files that are to be sent over long distances or stored for long 
times must be constructed more carefully. If the printer is not guaranteed to contain the fonts 
referenced by the document, then the document must contain its own fonts. If the printer 
does not have the same resolution as raster fonts packaged with the document, then the 
document will not print. 


The Interpress 2.1 standard is very vague about the factors that will make an Interpress 
file useful over long distances and times. There is no explicit mention of how font access 
works. There is a rigid hierarchical font naming scheme, but no explicit provision for 
including a font with a document. The Interpress standard document defines a number of 
language subsets; this means that to be maximally conservative, an Interpress document must 
be prepared to use the smallest possible Interpress subset, in order that it can print on the 
largest number of printers. This argues that an Interpress file created for storage or 
publication should not use any of the advanced features of Interpress, because the destination 
printers might be subset printers. 


By contrast, PostScript treats fonts as part of the language, with attention given to custom 
and user-defined fonts. A PostScript font name is just a character string, like every other name 
in the language. All PostScript printers support the entire language, which means that a 
document prepared for any PostScript printer will print on any other PostScript printer, 
provided that either the destination printer or the document contains the requisite fonts. 


The final system issue worth some scrutiny is the issue of after-the-fact editing of files 
that for some reason do not print as intended. If for some reason an Interpress file is retrieved 
from a database and found not to be printable on a certain printer, the file can be repaired 
only by specialized software that can read, understand, and update the Interpress format. By 
contrast, if a PostScript file is retrieved from a database and found not to be printable, it can 
be repaired by editing it with any text editor. 


In summary of the system issues, PostScript is clearly superior to Interpress for the 
purpose of long-distance transmission or long-time storage of printed images. Interpress’ rigid 
structure and binary representation make it appealing for use in a tightly-controlled local 
network system, but PostScript’s looser structure and textual representation make it more 
universal. Furthermore, the notion of Interpress “subset implementations” is a huge 
impediment to transmission or storage/retrieval; every PostScript printer implements the 
entire language, making portability, storage, and transmission much more realistic. 


Implementation Issues 


The implementation considerations are the most difficult to review and compare, because 
it is next to impossible to determine the reason for some annoying property of an 
implementation; it is also not entirely proper to criticize a language for the state of its 
implementation. Nevertheless, the history of programming languages has repeatedly shown 
that good implementations of languages have longer-lasting impact than good designs. 


ABSTRACT 


An important goal of document preparation systems is that they be device- 
independent, which is to say that their output can be produced on a variety of printing 
devices. One way of achieving that goal is to devise a device-independent page 
description language, which can describe precisely the appearance of a formatted 
page, and to produce software that prints the required image on each variety of printer. 
Most attempts at device-independent page description languages have failed, resulting 
either in schemes that are only partially device-independent or in proclamations from 
researchers that device independence is a bad idea [2, 4]. 


A new generation of procedural page description languages promises a solution. The 
PostScript language, and to a slightly lesser extent the Interpress language, offers a 
means of describing a printed page with an executable program; the page is printed by 
loading the program into the printer and running it. 


1. Page Description Languages 


An imaging device, such as a typesetter, laser printer, or display, must have some way of 
knowing what image it is being asked to show. The two traditional means of providing it with 
that information have been to describe the image to the imager in terms of a bitmap or 
character map or describe the image to the imager by means of a sequence of control 
commands to the imager’s electronics. 


The bit-map or character-map schemes are the simplest and oldest. For example, a line 
printer is provided with a character map (in this spot put the character X; in this spot put the 
character Y, and so forth). A CRT screen normally has a corresponding memory buffer such 
that each bit on the screen is tied to one bit in memory, and the screen pixel can be made light 
or dark by turning the bit on or off. Mapping schemes require that the dimensions and 
spacing of the characters or bits be identical to what the image creator wanted, or the resulting 
image will be rotated or scaled, perhaps even anamorphically. For example, the pixels on the 
IBM Personal Computer screen are rectangular rather than square, so that an image that was 
specified as an evenly-spaced bitmap will appear to be vertically elongated when displayed on 
its screen. 


Schemes to describe images via commands to the controllers that generate the image are 
potentially more device-independent. For example, if the image is to consist of a horizontal 
line, then the image description can consist of the commands to move a pen to one end of the 
line and then swing it to the other end of the line, without needing to know the device 
resolution or how many pixels must be turned on between the endpoints in order to draw the 
line. Examples of command-stream image description include pen plotters, daisy-wheel 
printers, and laser printers made by Imagen [7]. Almost all current command-stream image 
description languages are derivatives of the XCRIBL system done in 1972 at Carnegie-Mellon 
[5]. 


Bitmap image descriptions take an enormous amount of storage space, are not device- 
independent, and require the program generating the images to have access to character font 
information so that the proper bits can be set. Furthermore, it is extremely difficult to edit or 
modify a bitmap image description, as for example to correct a spelling error in a formed 
image, or to remove a component of the image and replace it by another: bitmap image 
descriptions are not suitable for further processing. 


By contrast, command-stream image descriptions do not always have the capability of 
describing every possible image. For example, if the controller has no command for rotating 
the page or rotating a text character, then it is impossible to describe an image that includes a 
rotated character. The allure of bitmap image descriptions is that they are universal; the allure 
of command-stream image descriptions is that they are more compact, more editable, and 
somewhat device-independent. 


Clearly an ideal image description scheme will share the universality of bitmap 
descriptions with the device-independence and compactness of command-stream image 
descriptions. 


2. Procedural Page Description Languages 


A procedural page description is a program, written in some graphics programming 
language, that, when executed, will create the intended page image. The idea is attributed to 
Warnock and Sproull, who devised it as the basis for the Interpress page description language 
[3]. 


Command-stream page description languages would appear to be “procedural” and in 
fact the descriptions that are written in these languages are definitely procedures in the 
ordinary meaning of the word: “move the cursor to [12,354]. Switch to boldface. Draw a ‘Q’. 
Move right 9 units.” The true power of a procedural page description language, however, 
comes from the ability to write conditionals, to define and call functions, and to perform 
arbitrary computations based on the value of variables stored inside the printer. I reserve the 
term “procedural page description” for languages with these properties. The ability to 
redefine built-in functions is valuable but not necessary. 


3. Comparing procedural and nonprocedural page descriptions 


Procedural descriptions are often more compact than nonprocedural descriptions of the 
same image, for they can take advantage of regularities in the image. For example, consider a 
procedural description of a piece of graph paper or a geometric grid. It can define a procedure 
to draw a line, then call it repeatedly in an appropriate loop. A nonprocedural description of 
the same image, by comparison, must have a separate item describing each line in the grid. 
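

The graph-paper comparison can be made concrete with a small sketch. It is written in Python rather than PostScript, and the function name, the segment representation, and the grid dimensions are all illustrative assumptions; it only contrasts a description that loops with one that lists every line.

```python
def grid_procedural(columns, rows, spacing):
    """Generate the line segments of a grid from three parameters."""
    width, height = columns * spacing, rows * spacing
    segments = []
    for i in range(columns + 1):          # vertical lines
        x = i * spacing
        segments.append(((x, 0), (x, height)))
    for j in range(rows + 1):             # horizontal lines
        y = j * spacing
        segments.append(((0, y), (width, y)))
    return segments


# A nonprocedural description must list every line explicitly.
# This is the same 2 x 2 grid with 10-unit spacing, written out by hand.
grid_enumerated = [
    ((0, 0), (0, 20)),
    ((10, 0), (10, 20)),
    ((20, 0), (20, 20)),
    ((0, 0), (20, 0)),
    ((0, 10), (20, 10)),
    ((0, 20), (20, 20)),
]

if __name__ == "__main__":
    print(grid_procedural(2, 2, 10) == grid_enumerated)
    # The procedural text does not grow with the grid; the list does.
    print(len(grid_procedural(50, 70, 10)))
```

The procedural form keeps the same length however large the grid becomes, while the enumerated form gains one entry per line.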


Procedural descriptions allow the use of abstraction and modular construction in image 
assembly. One can assemble a library of procedures that draw commonly-used images, and 
call them inside larger diagrams. Naturally the ability to achieve modularity is not a guarantee 
that image representations will be modular, any more than a structured programming 
language like Modula-2 will guarantee that the programs written in it are well-structured. 


Procedural descriptions can mimic any other page-description language, simply by 
programming subroutines in the procedural description language that duplicate the effect of 
the commands in another language. 


Procedural descriptions can be device-adaptive as well as device-independent, by 
delaying certain decisions about the appearance of the image until the specifics of the printing 
device are known. For example, see Figure 1, which shows four instances of the Stanford 
University logo in 25-point through 50-point size. 


Figure 1: Adaptive specification: increasing detail with size 


All four of these logotypes are generated from the same procedural definition; notice that 
the detail in the outermost ring, the detail in the trunk of the tree, and the spacing between 
the two innermost rings changes as a function of the physical size of the logo. 


It is also worth noting that procedural descriptions can be abused more easily than 
nonprocedural descriptions: it is possible to write bad code in any programming language, but 
there is often only one workable way to describe an image in a nonprocedural description 
scheme. PostScript printers must be on guard for infinite loops in the pictures they are 
printing. 


4. PostScript and Interpress 


There are two extant procedural page description languages, namely the aforementioned 
Interpress and PostScript [1]. A brief discussion of their relative histories can be found in the 
preface of the PostScript reference manual; a more detailed explanation is given by Reid 
[1, 6]. 


Both PostScript and Interpress assume that the printer contains an interpreter for the 
executable language, and that a page is printed by executing the page description program on 
the printer; the image is constructed as a side-effect of the program execution. 


PostScript is more interesting because a complete implementation is readily available, 
and because a language description is widely available. The implementation of Interpress is 
much more limited and is not widely available; the Interpress documentation can only be 
obtained by special order from Xerox [3]. 


Therefore I shall take examples from PostScript. PostScript can do anything Interpress 
can do; the reverse is not true, as Interpress has a certain number of limitations not present in 
PostScript [6]. In general, however, the explanations and examples to follow comment on both 
Interpress and PostScript. 


while input remains do 
    begin 
    token := nextLexeme(input); 
    lexType := lexicalType(token); 
    if lexType = name then 
        begin 
        tokenvalue := lookup(token); 
        tokentype := type(tokenvalue); 
        if executable(tokentype) then 
            execute(tokenvalue) 
        else 
            push(tokenvalue) 
        end 
    else push(token) 
    end 


Figure 2: PostScript semantics: outline of the interpreter 
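

The outline in Figure 2 can be approximated in a few lines of Python. This is only a sketch under simplifying assumptions: the input is an already-scanned list of tokens, names are Python strings, executable values are Python callables, and the dictionary holds just two arithmetic operators and one hypothetical variable named margin. It is not a PostScript implementation.

```python
def run(tokens, dictionary):
    """Execute a token sequence on an operand stack and return the stack.

    tokens: already-scanned list; numbers are literals, strings are names.
    dictionary: maps names to callables (executable) or plain values.
    """
    stack = []
    for token in tokens:
        if isinstance(token, str):            # lexType = name
            value = dictionary[token]         # lookup(token)
            if callable(value):               # executable(tokentype)
                value(stack)                  # execute(tokenvalue)
            else:
                stack.append(value)           # push(tokenvalue)
        else:
            stack.append(token)               # push(token)
    return stack


def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)


def sub(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a - b)


# "margin" is a hypothetical non-executable name used for illustration.
operators = {"add": add, "sub": sub, "margin": 72}

if __name__ == "__main__":
    print(run([360, 72, "sub"], operators))     # postfix: 360 72 sub
    print(run(["margin", 2, "add"], operators))
```

Because operands are pushed before the operator that consumes them is met, the operator needs no lookahead, which is what makes the notation postfix.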








5. PostScript Language Details 


A PostScript image description is a sequence of lexical tokens. Those tokens can be 
names, numbers, delimited strings, procedure bodies, array bodies, or comments. The tokens 
are delimited by white-space characters, and by certain other delimiter characters when a 
token boundary can be determined unambiguously. 


When a PostScript program is presented to a printer, it is executed. That execution takes 
place on a stack machine, with names stored in dictionaries, and graphics state stored in global 
variables. These stack semantics make PostScript operators postfix, hence the name. Figure 
2 shows an approximation of the PostScript interpreter. Each page begins completely white. 
Execution of imaging operators causes ink to be put in the image buffer. Color and grayscale 
are achieved by changing ink color before calling the imaging operators. All ink is opaque, 
even white ink, which is to say that “image priority” is always obeyed. The imaging operators 
use the interpreter’s global “graphics state” variables for such information as current position, 
ink color, clipping region, line-drawing parameters, halftone parameters, font, transformation 
matrix, etc. 


%!PS-Adobe-1.0 
72 72 moveto 360 576 lineto 
stroke copypage 
newpath 
200 200 moveto 300 400 lineto 
0 -200 rlineto 
400 200 100 180 360 arc 
400 700 lineto 
40 setlinewidth 1 setlinejoin 1 setlinecap 
stroke copypage 
72 72 translate 
360 72 sub 576 72 sub atan neg 90 add rotate 
/Helvetica-Bold findfont 90 scalefont setfont 
70 10 moveto (etaoinshrdiu) show 





Figure 3: PostS 


6. Some Illustrative Examples of PostScript 


Further explanation best awaits some examples. Each of these figures shows a PostScript 
program and a 15%-scale image of the page that it generates. Because of space limitations in 
this volume, these examples are necessarily cryptic: the reader is referred to the PostScript 
reference manual for further explanation [1]. 


Figure 3 shows two pages and the PostScript code that generated them; note the use of 
moveto, lineto, and stroke as active operators. The copypage operator is a debugging operator 
that prints the page buffer and then continues. The default coordinate system is in points, 


%!PS-Adobe-1.0 
newpath 
100 100 moveto 400 600 lineto 
100 setlinewidth 1 setlinecap stroke copypage 
400 100 moveto 100 400 lineto 
50 setlinewidth 0 setlinecap 0.75 setgray 
stroke copypage 
100 100 moveto 400 600 lineto 
30 setlinewidth 1 setgray stroke 
showpage 


Figure 4: Building up a page image from overlays of opaque inks 


with the origin in the lower left corner. First it draws a thin line, using the default line 
thickness. Then it sets the line width to 40 points, and draws a complex curve. Finally it 
rotates the coordinate system so that the originally-drawn line is the x-axis, then displays some 
characters of text horizontally along that axis. Figure 4 shows the mechanism by which the page 
image is built up from overlays of opaque inks. It sets a very wide line width and draws a 
diagonal line, then sets the ink color to gray and draws a cross-line, then sets the ink color to 
white and re-draws the original line with a narrower width and square corners. Notice that in 
every case the newest “ink” covers the older inks. Figure 5 shows the use of variables, 
functions, and arithmetic. It defines two variables, xcoord and ycoord, then defines a function 
linefunc that will draw a diagonal line at that [x, y], then add 100 to the value of ycoord. The 
four calls to linefunc generate the four diagonal lines shown. Figure 6 shows the effect of 
coordinate system transformations on an image. It defines a function named A, which draws a 
300-point letter “A” when it is called. The example calls A once, then shrinks the coordinate 


DOCUMENTATION GRAPHICS 











72 | REID 





%!PS-Adobe-1.0 
/xcoord 100 def /ycoord 200 def 
10 setlinewidth 
/linefunc { 
    newpath 
    xcoord ycoord moveto 
    100 100 rlineto 
    stroke 
    /ycoord ycoord 100 add def} def 
linefunc copypage linefunc 
20 setlinewidth linefunc copypage 
10 setlinewidth linefunc showpage 





Figure 5: Variables, functions, and arithmetic 


system anamorphically and calls it again, then rotates the coordinate system, changes to a dark 
gray ink, reverses the anamorphic scaling, and calls it again. 


7. Capabilities of Procedural Systems 


Having skimmed the basics of how a procedural page-description scheme works, let us 
turn our attention to some of its capabilities. The power of a procedural page-description 
language comes from: 

• the ability to express geometric shapes in a device-independent fashion, while 
retaining the ability to be device-dependent if necessary, 


%!PS-Adobe-1.0 
/A { 
    newpath moveto 
    100 300 rlineto 
    100 -300 rlineto 
    -50 130 rmoveto 
    -100 0 rlineto 
    stroke 
} def 

20 setlinewidth 50 50 A copypage 
300 400 translate 0.5 0.25 scale 
50 50 A copypage 
1 2 scale -40 rotate 
0.8 0.8 scale 0.5 setgray 
100 -150 A 
showpage 





Figure 6: The effect of coordinate system transformations 


• the ability to define and use new abstractions, and 


• the ability to do “late binding” of shapes, which permits a defined operator to be 
used for varying effects in varying contexts. 


The first of these capabilities is obvious, and was demonstrated by the Stanford-logotype 
example in section 3. The second of these capabilities is evident to any experienced 
programmer; its virtues need not be further praised. The third virtue, late binding, can best 
be explained with more examples. The production of the PostScript figures in this article is a 
good, though complex, example. Consider Figure 7. It shows a small image of this page, with a 
drop shadow and a thin line around the outside. This figure was generated by extracting (with 
a text editor) one page image from the Scribe-generated PostScript file for this article, 
surrounding it with some redefinitions, including it as a figure in the page, then repeating the 
process. Figure 8 shows those redefinitions, which change the scale factor, produce a clipping 
region exactly equal to the scaled page image, put a drop shadow and a frame outside that 
clipping region, and redefine operators that might interfere. 


Figure 7: Example of late binding of PostScript code 


This example, though fanciful, demonstrates the flexibility of the procedural scheme. 


The physical nesting of the diagrams is made possible by the recursive nature of the 
PostScript execution environment. The ability to redefine a page image to be a figure within 
itself comes from the ability to specify the page image in terms of late-bound names, i.e. as 
operators whose definitions can be changed. In this PostScript example, the late binding is 
achieved by redefining the built-in operators, though the same effect can be had, with a 
certain amount of discipline on the part of the user, by specifying every page in terms of user- 
defined functions that merely call the corresponding system function, then accomplishing the 
late binding by redefining those outermost functions. 
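

The mechanism can be imitated outside PostScript. In the Python sketch below, a page is a list of operator names, and each name is resolved at the moment it is executed by searching a stack of dictionaries from the innermost outward. The operator names, the log, and the dictionaries are stand-ins invented for illustration; only the idea of redefining showpage to do nothing is taken from Figure 8.

```python
def execute(program, dict_stack, log):
    """Run a list of operator names, resolving each name at call time."""
    for name in program:
        for d in reversed(dict_stack):      # innermost dictionary wins
            if name in d:
                d[name](log)
                break
        else:
            raise KeyError(name)


# Stand-ins for built-in operators; they only record what they would do.
systemdict = {
    "drawtext": lambda log: log.append("text drawn"),
    "showpage": lambda log: log.append("page ejected"),
}

# The page description itself never changes.
page = ["drawtext", "showpage"]

# Printed directly, the page is ejected.
direct = []
execute(page, [systemdict], direct)

# Used as a figure, a wrapper dictionary redefines showpage to do nothing.
wrapper = {"showpage": lambda log: None}
as_figure = []
execute(page, [systemdict, wrapper], as_figure)

if __name__ == "__main__":
    print(direct)
    print(as_figure)
```

The same unmodified page produces two different effects because the meaning of showpage is decided only when the name is executed, not when the page was written.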


This same technique can be used to advantage in many ways. A PostScript file can be 
wrapped in a set of definitions that will cause it to print as 2 pages per page, or as 4 pages per 
%!PS-Adobe-1.0 
0.15 dup scale 
/mydict 100 dict def /inch {72 mul} def 
/pagepath { 
    newpath 
    0 0 moveto 8.5 inch 0 lineto 
    8.5 inch 11 inch lineto 
    0 11 inch lineto 
    closepath 
} def 

mydict begin 
/showpage {} def 
/nextpage {grestore 9.6 72 mul 0 translate 
    initclip initpage} def 
/initpage { 
    gsave 0.3 inch -0.35 inch translate 
    pagepath 0.8 setgray fill grestore 
    pagepath gsave 1 setgray fill grestore 
    0 setgray 0 setlinewidth stroke 
    pagepath clip newpath gsave 
} def 
initpage 
The PostScript page image text goes here 
end 


Figure 8: PostScript definitions for page-image figures 

















page, or with decorated borders, or in white letters on a black background, or with a frame for 
overhead projector slides. As with any other programmable system, it is limited more by 
imagination than by technology. 


References 


[1] Adobe Systems, Inc. PostScript Language Reference Manual. Addison-Wesley, 
Reading, Massachusetts, 1985. 

[2] Earnest, Les. “Would you want your daughter to be device-independent?” ARPANET 
Laser-lovers distribution, March 1985. 

[3] Interpress Electronic Printing Standard. Xerox Corporation, Stamford, Connecticut 
06904, 1984. Document number XSIS 04804. 

[4] Newman, William. “Press: A flexible file format for the representation of printed 
images.” In Actes des Journees sur la Manipulation de Documents, Rennes, France, 5 
May 1983. 

[5] R. Reddy, B. Broadley, L. Erman, R. Johnson, J. Newcomer, G. Robertson, and J. 
Wright. “XCRIBL, a hardcopy scan line graphics system for document generation.” 
Technical Report, Department of Computer Science, Carnegie-Mellon University, 
October, 1972. 

[6] Reid, Brian K. “PostScript and Interpress: a comparison.” ARPANET Laser-lovers 
distribution, March 1, 1985. 

[7] Ryland, Chris. Imprint System Manual. Imagen Corporation, 2660 Marine Way, 
Mountain View, California 94304 USA, 1983. 


SIGGRAPH’86 TUTORIAL COURSE NOTES 





INTERPRESS PAGE AND DOCUMENT DESCRIPTION LANGUAGE 77 


[Republished from IEEE Computer 19(6):72-77, June 1986] 


The Interpress Page and 
Document Description Language 


Abhay Bhushan and Michael Plass 
Xerox Corporation 


Potential users of computer-aided documentation systems almost inevitably face the 
problem of how to output document files created on a diverse set of tools, including 
workstations, mainframes, word processors, and scanners, to a broad range of shared printing 
devices, from low-speed desk top printers and low-resolution proof printers to high-speed 
document production systems and high-resolution phototypesetters. More often than not, 
output devices each require different formats to characterize a document's design, making it 
virtually impossible to achieve any consistency of appearance between versions of the 
document printed on different devices. They may also require unique interfaces with device- 
dependent control codes, protocols, and configuration requirements. Such demands 
effectively prevent the user from exploiting the full capabilities of advanced printing 
technology. The Interpress page description language addresses this problem with a device- 
independent interface and a format description methodology that make it possible to 
efficiently employ a full array of output resources in a manner that is transparent to the user. 








Toward device independence 


Computer-driven raster printers are inherently expressive devices, capable of printing 
any imaginable combination of text, graphics, and pictures simply by arranging the 
appropriate pattern of black (or colored) dots. Unlike character- or line-oriented devices, they 
are limited in the range of images they can print more by the expressive power of the interface 
to the document creation tool than by their own capabilities. 


Because raster printers are driven by computer software and do not use device-specific 
control codes, it is possible to have a universal interface or interchange standard in which any 
document may be represented, and that can drive any raster printer, independent of its 
resolution (expressed in dots per inch) and other device characteristics. Interpress attempts to 
be such a device-independent standard. Since Interpress is so conceptually different from 
conventional printer interfaces, it is generally referred to as a page and document description 
language. The Interpress printing architecture is a software architecture that represents a 
unified document description scheme for printing in diverse system environments, including 
stand-alone computers, data processing centers, publishing and office work groups, and large, 
interconnected networks. 


When work first began on Interpress over a decade ago at Xerox Palo Alto Research 
Center, the original goal was to design a system that permitted the same format to be used for 
editable (revisable form) as well as printable (final form) documents; but early in 
development this goal was seen as impractical and was discarded in favor of separate 
interchange schemes for revisable and final form documents. Interpress emerged as a 
language for final form document representation. The language was made publicly available 
in 1984 (an earlier version of the current Interpress 3.0), and Xerox is committed to keeping it 
an open system. 


The Interpress approach 


In one possible document description scheme, the document creation device would send 
to the raster printer a facsimile picture of the intended output, presented directly in that 
printer's raster format. Such a scheme, though conceptually simple, has many disadvantages. 
First, raster data, which amounts to a map of every point that the printer must address, takes 
up an enormous amount of storage space (millions of bits per page, even when compressed). 
This increases not only storage costs but also transmission time and costs. Further, if the page 
contains text, the computer program generating it must have access to the raster image of all 
of the fonts for all printers that might be used; such a requirement is quite impractical. It 
would also be difficult to transform the raster images--to rotate, scale, or move them to fit a 
particular space or to achieve a desired effect. Finally, the raster format is based on the 
resolution of the printing device and is therefore not device-independent. 


By contrast with this pictorial approach, Interpress represents a page not as a set of points 
for the printer to address but as a series of instructions analogous to a computer program. This 
technique permits the user to print substantially more complex documents than is possible 
with the static format specifications of character printers, and to do so with greater efficiency 
than is possible with conventional raster data. 


TEXTUAL: "The 65th character in the font named Classic 36" 


Figure 1. Three different representations of the letter “A”. 


Figure 1 illustrates this approach to representing the letter “A.” The pictorial 
representation takes several thousand bits at typical printer resolutions, the geometric 
representation takes several hundred bits, and the textual representation requires only eight 
bits once the font has been specified. While Interpress permits all three approaches, the 
textual approach is preferred in most applications. The geometric approach of outline 
characters is useful when unique character sizes are desired or when the characters need to be 
rotated at unusual angles. 
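

The order of magnitude for the pictorial case can be checked with a rough calculation. The Python sketch below assumes a 12-point character occupying a square cell, a 300 dot-per-inch printer, 72 points to the inch, and an uncompressed one-bit-per-pixel raster; none of these figures comes from the article, and they serve only to show how the estimate is formed.

```python
def raster_bits(point_size, dots_per_inch):
    """Bits in a one-bit-per-pixel square cell of the given point size."""
    pixels = round(point_size * dots_per_inch / 72)   # 72 points per inch
    return pixels * pixels


TEXTUAL_BITS = 8   # one character code, once the font has been specified

if __name__ == "__main__":
    bits = raster_bits(12, 300)
    print(bits, bits // TEXTUAL_BITS)
```

Under these assumptions the raster needs a few thousand bits, some hundreds of times more than the eight-bit character code.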


The program that drives the printing machine to produce the finished output document 
is called an Interpress master, written in the Interpress programming language. Although 
most masters consist only of simple statements, such as text and vectors, the full power of the 
programming language is available for complex applications. Programming is useful if the 
master is to adapt to the various properties of the printing device (such as page size, order of 
page printing, color or black and white, etc.) and to change Interpress masters in complex 
ways. 


Though a key task, describing pages is not enough. What is communicated to the printer 
usually is a document, which is a collection of one or more pages produced in a specific order 
and intended to have a specific relationship with one another in the document's final form. A 
document description language therefore must be able to specify how pages are put together. 
The Interpress design includes a set of printing instructions that enables the user to control 
the printing of documents--to invoke two-sided printing, for example, or a special finishing 
such as stapling. Printing instructions also provide information necessary for multi-user 
environments (the document's name, author, etc.) and enable the declaration of resources 
required for the document to be printed (e.g., additional files, fonts, and font sizes). 


The Interpress language 


Like all programming languages, Interpress has both syntax and semantics. The 
semantics of the language define how the various operators behave when they are executed by 
the printer; the syntax of the language defines how the calls to those operators are coded in a 
master. Since Interpress masters are intended to be created and interpreted by software and 
not by people, the language syntax was designed to make it easy for computers to produce 
and interpret, without concern for human readability. As such, Interpress commands are 
normally encoded in a binary format designed for compactness and decoding ease. For 
debugging purposes, utility programs translate the binary encoding to and from a human- 
readable text representation; this “written encoding” is used in the examples in this article. 


The software within a printer interprets (executes) an Interpress master to print a 
document. During that execution, the printed document is built up one page at a time. When 
a page of a master is printed, an interpreter “executes” the code that constitutes the page 
description, much as a machine executes a program. The state of this virtual machine includes 
a set of 50 “registers” called the frame, a (potentially large) stack, and a set of special imaging 
variables. This computational environment is shown in Figure 2. 


The elements stored in the frame and stack are Interpress values, which may be of type 
number (integer or real), identifier (similar to atoms in other languages), vector (a packaged 
sequence of values), body (a packaged piece of Interpress code), or operator (an Interpress 
program that can be executed). Character codes are represented by integers, and character 
strings by vectors of integers; there are compact encodings available for this kind of vector to 
keep the master compact. Additionally, there are special types that are constructed and used 
only by imaging operators, such as color, transformation, pixelArray, font, trajectory, and 
outline. 


[Figure 2 label: Imager variables] 
Figure 2. The Interpress computational environment. 


Interpress uses a postfix execution model: the occurrence of a literal in the instruction 
stream simply causes the corresponding value to be pushed on the stack, and an operator may 
pop some number of parameters off the stack and push some number of results. There are 
provisions for marking the stack to ensure that defined procedures (composed operators in 
Interpress parlance) pop and push the expected number of items. There is also a way of saving 
and restoring some or all of the imaging variables around the invocation of an operator. 
Furthermore, composed operators do not read or modify the caller’s frame; they have their 
own frame, initialized when they were created and the same at each call. Thus the side effects 
of composed operators are well controlled. 
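

The frame discipline described above can be pictured with a short Python sketch. It is a loose model only: the class name, the example body, and the use of Python lists for the frame and stack are assumptions made for illustration, and stack marking and imager variables are left out.

```python
class ComposedOperator:
    """A body of code packaged with a private copy of a frame."""

    def __init__(self, body, frame):
        self._body = body
        self._initial_frame = list(frame)    # snapshot taken at creation

    def __call__(self, stack):
        frame = list(self._initial_frame)    # fresh copy on every call
        self._body(stack, frame)


def double_plus_register0(stack, frame):
    """Pop one number; push twice that number plus frame element 0."""
    x = stack.pop()
    frame[0] += 1                            # local change, discarded on return
    stack.append(2 * x + frame[0])


if __name__ == "__main__":
    caller_frame = [0] * 50                  # the 50 "registers"
    caller_frame[0] = 10
    op = ComposedOperator(double_plus_register0, caller_frame)

    caller_frame[0] = 99                     # not seen by the operator
    stack = [3]
    op(stack)
    op(stack)
    print(stack, caller_frame[0])
```

Each call starts from the same private frame, so the operator's only lasting effects are the values it leaves on the stack.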


The Imaging Model 


Of course the most important side effects are those that change the image on the page. 
The software maintains a page image, which is altered by the imaging operators as the page is 
built. A complex page is made by starting with a blank page image and making a sequence of 
simple changes to it. Interpress is defined so that the partially built page image cannot affect 
the execution of the master; this makes Interpress useful for expressing images destined for 
non-raster devices such as pen-plotters and certain phototypesetters. All of the changes to the 
page image are made according to the Interpress imaging model illustrated in Figure 3: a 
color is instanced through a mask onto the page image, covering up (or perhaps altering) what 
was there before. In Figure 3, the color (represented by the parallelogram) is set with 0.5 
SETGRAY, and the “b” on the page image is printed using the filled outline mask defining the 
character's shape. 
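

The imaging model can be mimicked on a tiny grid. In the Python sketch below the page image is a list of rows, a mask is a grid of booleans, and colors are plain strings; the function name and the 4 x 3 page are illustrative assumptions, and only the opaque case (covering what was there before) is modeled.

```python
def apply_mask(page, color, mask):
    """Instance color onto page wherever mask is true (opaque paint)."""
    for y, row in enumerate(mask):
        for x, covered in enumerate(row):
            if covered:
                page[y][x] = color
    return page


if __name__ == "__main__":
    page = [["white"] * 4 for _ in range(3)]        # blank page image

    bar = [[False, False, False, False],
           [True,  True,  True,  True],
           [False, False, False, False]]
    dot = [[False, False, False, False],
           [False, True,  False, False],
           [False, False, False, False]]

    apply_mask(page, "black", bar)    # first change: a black bar
    apply_mask(page, "gray", dot)     # later paint covers the earlier paint
    print(page[1])
```

Every change to the page has this one form, whatever the shape of the mask or the kind of color.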


Colors 


Interpress uses the term “color” to designate a concept that is more general than what we 
mean in our day-to-day use of the term. An Interpress color may be simply black or white, or 
a shade of gray, or a “real” color like red or blue, or a generic color like “highlight,” which 
just means something other than black; these are all examples of constant colors. Constant 
colors may be specified by means of the SETGRAY operator (for simple grays), or from the 
printer’s environment (by supplying a name to the FINDCOLOR operator), or via a color model, 
which is simply an operator that accepts a vector of numeric parameters on the stack and 
returns a color on the stack. (Color models are normally obtained from the environment by 
means of the FINDCOLORMODEL or FINDCOLORMODELOPERATOR operators). 


[Figure 3 labels: Previous page image, Color, Page image] 
Figure 3. The Interpress imaging model. 


The other kind of Interpress color is a sampled color, which consists of a rectangular array 
of pixel descriptions, along with the specification (color model) for how the pixels are to be 
interpreted and a transformation to specify how the pixels of the sampled color are to 
correspond to the pixels of the page image. The sampled color conceptually tiles the whole 
page, so it may be used to produce textures and wallpaper-like effects, as well as full-color 
continuous-tone images. The simplest kind of sampled color consists of a one-bit-per-pixel 
array, with 1 denoting black and 0 denoting either white or clear; this is known as a sampled 
black. Interpress allows various types of compression to be used in the encoding of the arrays 
of pixels; the ones that are currently defined in the system's Raster Encoding Standard are 
expressly for the one-bit-per-pixel case. 
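

Tiling amounts to indexing the sample array modulo its size. The Python sketch below assumes the simplest possible transformation, in which one sample corresponds to one page pixel with no rotation or scaling; the function name and the 2 x 2 checker pattern are invented for illustration.

```python
def sampled_color_at(samples, x, y):
    """Value at page pixel (x, y) when samples tiles the whole page."""
    height, width = len(samples), len(samples[0])
    return samples[y % height][x % width]


# A 2 x 2 "sampled black": 1 = black, 0 = white or clear.
checker = [[1, 0],
           [0, 1]]

if __name__ == "__main__":
    row = [sampled_color_at(checker, x, 2) for x in range(6)]
    print(row)
```

A small array therefore suffices to cover a page of any size, which is what makes sampled colors convenient for textures.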


Masks 


The other half of the imaging model is the notion of masks. A mask is simply a two- 
dimensional shape used as a stencil in the application of color to the page image: it specifies 
the portion of the page image to be colored. A mask may be a rectangle, a stroke of a specified 
width along polygonal or curved trajectories, a dashed or dotted stroke, an area bounded by a 
set of trajectories, a bitmap (possibly compressed) at any resolution, or a string of text in a 
particular font. Any of these masks may be used with any color. The current color is one of 
the imaging variables, as is the current font and the current transformation (more about these 
later); the different kinds of masks have different operators. 


A simple example 


To illustrate some of the basic features of Interpress, consider the example in Figure 4. 
This master will print two pages; lines 1 through 7 specify the first page (containing a 
rectangular box made up of four strokes) and lines 8 through 13 specify the second 
(containing the text “Print this”). Lines 0 and 14 constitute the skeleton that brackets the page 
bodies (the empty brackets in line 0 are the preamble, which can set up the initial frame used 
for each page body; thus fonts, for example, must be declared only once rather than in each 
page). If the page brackets on lines 7 and 8 were eliminated, only a single page would be 
printed with “Print this” inside the rectangular box, as shown on the right in Figure 4. 


The operations that actually change the page images occur in lines 3 through 6
(MASKVECTOR) and line 12 (SHOW); the other lines set up the necessary imager variables
(strokeWidth for MASKVECTOR, font and current position for SHOW). All the dimensions
in this example are expressed in meters, which is the unit of the initial coordinate system of
Interpress; the origin is in the lower left corner of the page, with x increasing to the right and y
increasing towards the top of the page.
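
Because the coordinates are in meters, the numbers in Figure 4 look odd until they are converted to more familiar units. The following Python sketch does that arithmetic; the helper names are illustrative only, and the figure of 72 points per inch is an assumed approximation rather than anything stated by Interpress.

```python
METERS_PER_INCH = 0.0254
POINTS_PER_INCH = 72  # assumed approximation of the printer's point

def meters_to_inches(m):
    return m / METERS_PER_INCH

def meters_to_points(m):
    return meters_to_inches(m) * POINTS_PER_INCH

# Opposite corners of the box drawn by the four MASKVECTOR operations.
box_corners_m = [(0.0254, 0.2286), (0.1905, 0.254)]
box_corners_in = [
    (round(meters_to_inches(x), 3), round(meters_to_inches(y), 3))
    for x, y in box_corners_m
]

# Scale factor applied to the font on line 9 of the master.
font_scale_pt = round(meters_to_points(0.0127), 3)
```

Under these assumptions the box runs from (1, 9) to (7.5, 10) inches, and the font scale of 0.0127 meter is half an inch, or 36 points.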


Typographic printing 


Typography is the art of designing and placing letterforms to create a legible and pleasing 
effect. The quality of typography largely depends upon the availability of character sets and 
fonts. The Interpress printing architecture includes the multilingual Xerox Character Code set 
and the Font Interchange set. Interpress also allows other fonts and character codes to be used 
and intermixed freely in a master. However, the ASCII formatting characters such as
carriage return and tab are not recognized by the system because there is no accurate way to
interpret what these characters should do. Formatting operations are achieved by positioning
commands such as SETXY. Interpress also does not generate ligatures automatically. Ligatures
must have a separate character code and representation (provided in the Xerox Character Code
set). This reflects an important principle of Interpress design: all decisions about presentation
and formatting should be made by the creator, not the printer. This principle ensures that
documents are printed accurately and have uniform appearance on different output devices.


Line number
(ref. only)  Interpress Master                            Comments (for reference only)

--0--        BEGIN { }                                    part of the “skeleton”
--1--        {                                            beginning of the first page body
--2--        0.001 15 ISET                                set imager variable 15
                                                          (strokeWidth) to 0.001 meter
--3--        0.0254 0.2286 0.0254 0.254 MASKVECTOR
--4--        0.1905 0.2286 0.1905 0.254 MASKVECTOR
--5--        0.0254 0.2286 0.1905 0.2286 MASKVECTOR
--6--        0.0254 0.254 0.1905 0.254 MASKVECTOR
--7--        }                                            end of the first page body
--8--        {                                            beginning of second page body
                                                          line 9 defines font and saves it in
                                                          frame element 0
--9--        [xerox xci-1-1 modern] FINDFONT 0.0127 SCALE MODIFYFONT 0 FSET
--10--       0 SETFONT                                    sets the “current font”
--11--       0.08128 0.23622 SETXY                        sets the “current position”
--12--       <Print this> SHOW                            place “Print this” at current
                                                          position in current font
--13--       }                                            end of second page body
--14--       END                                          end of master (more of the
                                                          “skeleton”)

[Illustration: on the left, the two page images produced by this Interpress master; on the
right, the one page image produced by this Interpress master without lines 7 and 8, with
“Print this” inside the box.]


Figure 4. A simple Interpress master and the two pages that it describes.


Further facilities include letterform definitions expressed as character operators;
positioning operators used to control the position of the letters; geometric transformations to 
scale, rotate, and translate a letterform so that it can appear in arbitrary size, rotation, and 
position on a page; and additional graphical operators to define underlines, strikethroughs, 
and the like. Positioning may be either absolute (with respect to the page) or relative (with 
respect to some other coordinate system such as a box within a page). To assure correct 
justification and margin alignment, the system provides a CORRECT operator for spacing 
adjustment. It also defines a flexible way to perform kerning (alteration in the intercharacter
spacing of pairs of characters) for a better appearance.


Fonts may be stored at the printer in outline or bitmap forms, or they may be 
communicated as part of an Interpress master. The bitmap fonts generally are fine-tuned for 
the characteristics of a specific printer and represent high typographic quality. Outline fonts 
represent greater versatility since they can be easily scaled and rotated to provide printing in 
any size and at any angle. 


Graphics printing 


In some sense, the term “graphics” applies to everything that goes on the page; here, 
though, the term applies to elements other than black text at normal sizes and orientations 
and the rectangles that normally appear with such text. 


Rectangles are very simple to specify: the operator MASKRECTANGLE expects the four
variables x, y, w, and h on the stack (x,y coordinates of one corner, the width, and the height).
Single line segments, as we have seen, can be drawn by supplying the two endpoints (four
numbers x1, y1, x2, y2) to the MASKVECTOR operator; Interpress allows control of the stroke
width and the style of the end caps (square, butt, or circular). More complicated shapes are
specified by building trajectories, representing a sequence of straight or curved segments
connected end-to-end. A trajectory is constructed segment-by-segment, using a pen-plotter
analogy: the MOVETO operator takes the coordinates of a point (two numbers x0, y0) and
returns a single-point trajectory. The other trajectory-building operators all take a trajectory
followed by the appropriate number of coordinates and parameters, and return a new
trajectory. A trajectory is an Interpress value and may be copied and saved for reuse; the most
common pattern, though, is to construct it on the stack and use it right away. The other
trajectory-building operators are:
x1 y1 LINETO                 straight line segment from (x0,y0) to (x1,y1);
x1 y1 x2 y2 r CONICTO        conic segment--part of circle, ellipse, parabola, or hyperbola;
x1 y1 x2 y2 ARCTO            circular arc passing from (x0,y0) through (x1,y1) to (x2,y2); and
x1 y1 x2 y2 x3 y3 CURVETO    parametric cubic curve ending at (x3,y3).
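
The "take a trajectory, return a new trajectory" pattern can be sketched in Python with immutable values. This is an illustration of the idea only: the Trajectory class and the lower-case function names are hypothetical, and only the straight-line and cubic operators are modelled.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]

@dataclass(frozen=True)
class Trajectory:
    """An immutable start point plus a tuple of segments appended so far."""
    start: Point
    segments: Tuple[tuple, ...] = ()

    @property
    def last_point(self) -> Point:
        # Each segment is stored with its end point as the final element.
        return self.segments[-1][-1] if self.segments else self.start

def moveto(x0, y0):
    """Return a single-point trajectory."""
    return Trajectory(start=(x0, y0))

def lineto(t, x1, y1):
    """Return a new trajectory with a straight segment to (x1, y1)."""
    return Trajectory(t.start, t.segments + (("line", (x1, y1)),))

def curveto(t, x1, y1, x2, y2, x3, y3):
    """Return a new trajectory with a cubic segment ending at (x3, y3)."""
    return Trajectory(t.start, t.segments + (("curve", (x1, y1), (x2, y2), (x3, y3)),))

# One side of a triangle, built the way a pen plotter would trace it.
first = moveto(70, 75)
side = lineto(lineto(first, 50, 25), 55, 75)
```

Because every operator returns a fresh value, `first` is still a single-point trajectory after `side` has been built, which is what allows a trajectory to be copied and saved for reuse.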


Once a trajectory is built, it can be supplied to the MASKSTROKE operator to draw a
constant-width stroke; the master may specify whether the joints between the segments
should be mitered, beveled, or rounded. For a dashed or dotted stroke, the trajectory and a
dash specification are provided to the operator MASKDASHEDSTROKE; fancy combinations of
dashed and dotted strokes can be obtained by using the same trajectory with several different
sets of dash specifications. MASKSTROKECLOSED draws a stroke with the trajectory closed by
joining its two endpoints (with a line segment, if necessary); this allows the proper joints to be
made in place of the end caps.


Several trajectories can be combined to form an outline by using the MAKEOUTLINE
operator; an outline represents a filled geometric shape, which may have holes (using multiple
trajectories that wrap in opposite directions). An outline can be used with the MASKFILL
operator to fill the region with the current color. It can also be used as an argument to the
CLIPOUTLINE operator to provide a clipping region for all subsequent masking operations
(until the imager variables are restored); a simpler CLIPRECTANGLE operator is also
available.


The MASKPIXEL operator uses a bitmap as a mask, represented in the same form as that used by the
MAKESAMPLEDBLACK operator. Optionally, the bitmap can be in a compressed format.


Figure 5 shows an example that uses many of the Interpress graphics primitives to draw 
an ice cream cone. The written encoding of the entire master is shown along with comments. 
You may want to try plotting the positions of the vertices and control points on a piece of
graph paper to get a better feel for how they work.


One of the more useful capabilities of Interpress is the set of geometric transformations
that are at the heart of the imaging operations. Interpress has a set of operators for building
primitive transformations (SCALE, ROTATE, TRANSLATE) and two operators for combining
transformations (CONCAT, CONCATT). Mathematically, the transformations can be represented
as 3x3 matrices. Transformations can be applied to any graphic image, including line
drawings, scanned images, and character shapes. The transformation capabilities are also
useful in publishing applications such as two-sided printing of bound documents and
creation of n-up printing signatures.
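
A minimal Python sketch of such 3x3 matrices follows. It assumes a column-vector convention and uses hypothetical lower-case helper names; the convention that Interpress itself uses for storing and multiplying its matrices is not taken from the text above and may differ.

```python
import math

def scale(s):
    return [[s, 0, 0], [0, s, 0], [0, 0, 1]]

def rotate(degrees):
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def concat(m, n):
    """Return the matrix that applies m first and then n (the product n * m)."""
    return [[sum(n[i][k] * m[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, x, y):
    """Transform the point (x, y) by the matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

scale_then_move = concat(scale(2), translate(10, 0))
move_then_scale = concat(translate(10, 0), scale(2))
```

The order of combination matters: the first matrix maps the point (1, 1) to (12, 2), while the second maps it to (22, 2).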
Interpress Master                            Comments (for reference only)

BEGIN {} {
  1 5 ISET                                   will use overlapping shapes,
                                             so set priorityImportant to TRUE
  2/1000 SCALE CONCATT                       use 2 millimeter units

                                             The Cone
  3/5 SETGRAY                                darker gray side of cone
  70 75 MOVETO 50 25 LINETO 55 75 LINETO     a triangular trajectory
  1 MAKEOUTLINE MASKFILL                     fill the darker triangle

  3/10 SETGRAY                               light side of cone is similar
  55 75 MOVETO 50 25 LINETO 30 75 LINETO
  1 MAKEOUTLINE MASKFILL

  2 16 ISET                                  round ends on the stroke
  0 23 ISET                                  miter to get a sharp point
  2 15 ISET                                  stroke will be 2 units wide
  1 SETGRAY                                  set current color to black
  70 75 MOVETO 50 25 LINETO 30 75 LINETO     line around cone bottom
  MASKSTROKE

                                             The Ice Cream
  30 75 MOVETO
  50 25 70 75 1/10 CONICTO                   elliptical piece at bottom
  80 80 80 90 72 87 CURVETO                  parametric cubic at right
  50 122 28 87 ARCTO                         circular arc at the top
  15 90 30 75 1/2 CONICTO                    parabolic piece at left
  DUP                                        make a copy of trajectory
  1/5 SETGRAY                                light gray for ice cream
  1 MAKEOUTLINE MASKFILL                     fill in the area
  2 23 ISET                                  use rounded joints for outline
  1 SETGRAY                                  set current color to black
  MASKSTROKECLOSED                           draw stroke outlining area
} END


Figure 5. An example of Interpress graphics primitives to draw an ice cream cone.
Open system architecture


Interpress has evolved from many years of use in a distributed network environment. 
The Interpress architecture, with its functional richness in describing pages and documents 
and its device-independent capability, represents an effective solution to the problem of 
printing the wide variety of documents in diverse printing environments with a common 
standard interface. Interpress has proven to be gracefully extensible so that as new printing 
technology, new applications, and other new requirements emerged, existing masters could be 
used unmodified.


Xerox has chosen to make the Interpress printing architecture a completely open system.
Interpress and related systems have been made publicly available, free of any royalty or
licensing fees.
BEGIN
{
This is the preamble; the frame constructed here is used as the initial frame for all the page
bodies (only one in this example). Here we just put a font for the copyright notice.
    Xerox xci-1-1 Modern 3 MAKEVEC FINDFONT 10 SCALE MODIFYFONT 0 FSET
}
{
Start of the page body; set priorityImportant and provide a transformation for the whole page.
    1 5 ISET
    1 SETGRAY
    10795/100000 13970/100000 TRANSLATE CONCATT
    1/3000 SCALE CONCATT
Make a composed operator for a single cone: this is like the previous example except that it is
transformed to its proper place relative to the center of the spiral.
    MAKESIMPLECO {
        0 200 TRANSLATE CONCATT
        -22 ROTATE CONCATT
        -50 -25 TRANSLATE CONCATT
        3/5 SETGRAY
        70 75 MOVETO 50 25 LINETO 55 75 LINETO 1 MAKEOUTLINE MASKFILL
        3/10 SETGRAY
        55 75 MOVETO 50 25 LINETO 30 75 LINETO 1 MAKEOUTLINE MASKFILL
        2 16 ISET 0 23 ISET 2 15 ISET 1 SETGRAY
        70 75 MOVETO 50 25 LINETO 30 75 LINETO MASKSTROKE
        30 75 MOVETO 50 25 70 75 1/10 CONICTO 80 80 80 90 72 87 CURVETO
        50 122 28 87 ARCTO 15 90 30 75 1/2 CONICTO DUP
        1/5 SETGRAY 1 MAKEOUTLINE MASKFILL
        2 23 ISET 1 SETGRAY MASKSTROKECLOSED
    } 10 FSET
Make a composed operator for a ring of six cones.
    MAKESIMPLECO {
        10 FGET DOSAVE -60 ROTATE CONCATT
        10 FGET DOSAVE -60 ROTATE CONCATT
        10 FGET DOSAVE -60 ROTATE CONCATT
        10 FGET DOSAVE -60 ROTATE CONCATT
        10 FGET DOSAVE -60 ROTATE CONCATT
        10 FGET DOSAVE -60 ROTATE CONCATT
    } 11 FSET
Do the above, but with a scale and rotate to do the next smaller set.
    MAKESIMPLECO {11 FGET DO 9/10 SCALE -15 ROTATE CONCAT CONCATT}
    12 FSET
Do that many times to draw the spiral.
    DOSAVESIMPLEBODY {
        MAKESIMPLECO {12 FGET DUP DO DUP DO DUP DO DUP DO DO}
        DUP DO DUP DO DUP DO DUP DO DUP DO DUP DO DUP DO DUP DO DO
    }
Finally, show the copyright notice. Octal 323 is the code for the copyright symbol.
    1 SETGRAY 0 SETFONT -300 -300 SETXY
    <Copyright \323 1986, Xerox Corporation. All rights reserved.> SHOW
}
END


Figure 6. The written Interpress commands to draw a spiral of ice cream cones.
[Illustration: a spiral of ice cream cones, with the notice “Copyright © 1986, Xerox
Corporation. All rights reserved.”]


Figure 7. A spiral of ice cream cones.
ABSTRACT


This paper starts by tracing the architecture of document preparation systems. Two
basic types of document representations appear: at the page level or at the logical level.
The paper then focuses on logical level representations and surveys three
existing formalisms: SGML, Interscript and ODA.


1. Introduction 


Document preparation systems may now be the most commonly used computer
systems, ranging from stand-alone text processing machines to highly sophisticated
systems running on mainframe computers. All of those systems internally use a more or less
formal system for representing documents. Document representation formalisms differ
widely according to their goals. Some of them define the interface with the printing device;
they are oriented towards a precise geometric description of the contents of each page in a
document. Others are used internally in systems as a memory representation. Yet others have
to be learned by users; they are symbolic languages used to control document processing.


The trouble is that there are today nearly as many representation formalisms as
document preparation systems. This makes it nearly impossible, first, to interchange
documents among heterogeneous systems and, second, to have standard programming interfaces
for developing systems. Standardization organizations and large companies are now trying
to establish standards in the field in order to stop the proliferation of formalisms and facilitate
document interchange.


This paper focuses in the last sections on three document representation formalisms often 
called ‘revisable formats’, namely SGML [SGML], ODA [ODA], and Interscript [Ayers & al.], 
[Joloboff & al.]. In order to better understand what a revisable format is, the paper starts with
a look at the evolution of the architecture of document preparation systems.


2. Architecture of document preparation systems. 


Document preparation systems appeared as soon as computer printing devices were
able to output typewriter-like quality documents. Although the evolution of printing
technology has been the major one, several other factors have influenced the architecture of
document preparation systems: low cost computing power, distributed systems, and the
simple maturation of ideas in the field. The evolution of printing technology has led to the
digital representation of documents ready to be printed, called final form representation. The
evolution of software techniques has principally led to representations capturing the logical
structure, the structure that is perceived by the author when the document is revised, i.e.
constructed or modified.


2.1 Final form representation 


On early document preparation systems, printing devices were basically typewriter-like 
terminals directly connected in character mode to the single processing computer. Those
devices were driven by sequences of control characters inserted in the data stream they 
received in order to produce layout rendition (underlining, overstriking). A formatting 
system basically had to translate the formatting commands into printer control sequences. 


As printers from different vendors had different control sequences, device independent
formats were needed in order to print the same document at different sites with different
printers. Thus final form representation appeared, that is, the final digital representation of a
document before it is printed. The main property of a final form representation is that the number
of pages in the document has been computed, and the way each object (character string or
graphics) should appear on the page is totally determined.


On non-impact printers, virtually any image is reproducible: characters in any alphabet,
graphics and images as well. There is no longer a specific set of imaging functions
available from the hardware. The limit to the expressive power of the page creator is then set
by the software interface.
This fundamental change brought by technology has implied a fundamental change in
the design of final form representations for non-impact printing. A final form representation
is no longer a sequence of characters; it has to be an organized structure. A formal
method must be used to describe the page layout, offering maximum expressiveness to the
page creator. Such formalisms theoretically allow for the description of any page for any
printer.


They divide into static formats and the more recent dynamic ones. In a static format the
page layout is described as a static data structure. The standard CCITT T73 [T73] is a typical
example of such formats. In dynamic formats, also referred to as procedural page description
languages, a page description actually describes how to compute the layout.


Brian Reid’s paper [Reid86] in this same conference discusses procedural page description
languages, such as PostScript [PostScript], more extensively. The point we want to emphasize
now is that the architecture (figure 1) of document preparation systems now has a clean
interface with printing devices. It generates a final form representation of documents in terms
of a structured page description formalism.


2.2 Revisable form representation 


A document has to undergo many additions or modifications before it is ready to be
printed. Working on a page based representation when editing a document would be tedious 
and cumbersome both for users and the editing system. An unformatted representation of 
documents is necessary. This representation typically is the output of the editing system and 
the input of the formatting system. Figure 1 shows the three basic components of a document 
preparation system: editing, formatting and printing. Revisable form and final form are the 
two representations interfacing these components. 


The first document preparation systems naturally imitated the method used in the
publishing industry for typesetting: additional information is interspersed among the 
document contents to produce a data stream directly processed by the typesetting device. On 
those early systems the revisable form representation simply consists of a text file containing 
control sequences, directly keyed in by the user from a standard terminal. 





[Diagram: TEXT PRODUCTION --(REVISABLE FORMAT)--> FORMATTING --(FINAL FORMAT)--> PRINTING]


Figure 1. Typical architecture of a document preparation system.
Control sequences consist of a series of markup signs. That was the beginning of so-called
procedural markup languages, since those markup signs were interpreted as instructions
controlling subsequent processing in the formatting system.


Procedural markup has well-known drawbacks:


• the logical structure of a document is not evident once the document is
marked up. For example, if chapter titles have been marked with a centering
command, it is not clear that what follows a centering command is a
title. If someone later wants to flush all titles right, changing all centering
commands into flush commands will probably not give the expected result.


• the style of the resulting documents, i.e. the aspect of the document layout, is
determined by the user who placed the markup signs. A good layout style, if any
style at all, requires some typographic knowledge from the user. The lack of this
knowledge is responsible for all of the ugly documents produced on procedural
markup systems. Also, it makes it difficult to output the same document in a
different style.


The disadvantages of procedural markup have been avoided with a new method, known as
declarative markup. The standpoint in declarative markup is that the user should describe the
logical structure of a document, that is, what is to be processed rather than how the document
content is to be processed. A user enters markup signs indicating logical properties of data,
for example paragraph or heading, expressing its logical structure, which is more
familiar and does not imply any particular processing. The responsibility for making consistent
styles, or applying specific functions, is left to the system. GML [Goldfarb] and
Scribe [Reid83] are two examples of declarative markup systems; the reader is referred to
[Furuta & al.] for an extensive survey of such formatting systems.
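
The separation between logical tags and layout decisions can be sketched in Python. The tag names, the style table and the lay_out function below are all hypothetical; the sketch only shows that, once the structure is declared, flushing every heading right is a one-line change to the style rather than an edit to the document.

```python
# Alignment functions keyed by a style name (all names are illustrative).
ALIGN = {"center": str.center, "right": str.rjust, "left": str.ljust}

def lay_out(document, style, width=24):
    """Lay out each (tag, text) item using the alignment the style gives its tag."""
    return [ALIGN[style[tag]](text, width) for tag, text in document]

# The document records what each piece of text is, not how it should look.
document = [("heading", "Masks"), ("paragraph", "A mask is a stencil.")]

centered = lay_out(document, {"heading": "center", "paragraph": "left"})
flushed = lay_out(document, {"heading": "right", "paragraph": "left"})
```

With procedural markup, the same change would require finding every centering command and deciding whether it marks a title.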


The SGML formalism is essentially the definition of an international standard by ISO
covering these systems. Although an SGML entity may refer to non-character data, as shown in the
next section, it has been designed in the spirit of all markup systems. As the standard says
(page 3): “The millions of existing text entry devices must be supported. SGML documents
can easily be keyboarded and understood by humans.”


A user does not need a specific editor to build a marked-up document. As long as there are
only characters, any editor will do on any standard terminal. The revisable form
representation of a document in a markup system, be it declarative or procedural, is (or
should be) fully known to users, since they have to key it in.


More recent approaches have a different viewpoint. They assume the revisable form
representation is not directly accessed by users, but solely by the editing system. Thus a
specific editor is needed, which generates that representation. It is intended that such editors
will not expose users to the revisable representation; they will actually hide the internal
representation of documents from the user, constructing this representation themselves from
the user input.


These editors are expected to provide a friendlier user interface. Most of the editors
from this new generation, for example Grif, presented in this conference [Quint & Vatton],
do not run on standard terminals. Rather, they use bitmap display terminals, a window system
and a pointing device.


The new type of document representation used in this approach may then be designed to
be quite complex, nearly unmanageable by human beings, but very suitable to be handled by
computers. Graphics and images may be directly inserted in documents more easily than in
markup formats. Graphics may rely on existing standard graphics representations, and images may
be stored through specific data compression techniques, while the user only sees on the screen
a real layout.


Interscript and ODA both belong to this new generation of formalisms. They assume
more computing power from the editing system and lose the possibility of being directly entered
from a standard terminal, but promise many more possibilities.


3. Generalized Markup Language 


SGML stands for Standard Generalized Markup Language. It is essentially a declarative
markup language, which has inherited mainly from its ancestor GML. However, it includes
many interesting new features.


A first difference from its predecessors is that markup is defined rigorously. It is possible
from the SGML standard definition to build a general syntactic parser that will not encounter
ambiguities. Thanks to this rigorous syntax, SGML documents may be processed very
much like programs by a compiler. A document may be parsed to build an abstract syntax
tree together with its attributes.


The semantics of that tree may be evaluated by semantic functions according to the attribute
values. Thus, SGML can be used for tasks other than formatting. The semantics of markup
tags and attributes might be used for machine translation, automatic indexing or any other
process needing parsing of documents.


A markup sign in SGML is named a tag. Any element which needs to be tagged starts
with a start-tag and ends with an end-tag. Any tag is delimited by the characters < and >. A tag
is defined by an identifier, which appears first in the start-tag. An end-tag repeats the same
identifier preceded by /. Note that all of these mark characters are redefinable for each
document.


For example, a paragraph will appear as:

<p>This is a short paragraph.</p>

End tags may also be omitted under conditions specified in the standard.
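
To illustrate how such tagged text can be turned into a tree, here is a minimal Python sketch. It is not an SGML parser: it assumes the default < > / delimiters, no attributes, and no omitted end-tags, and the function name parse and the "#document" root label are hypothetical.

```python
import re

# A start-tag or end-tag with a plain identifier, or a run of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")

def parse(text):
    """Parse fully tagged text into nested (name, children) tuples; strings are leaves."""
    root = ("#document", [])
    stack = [root]
    for close, name, data in TOKEN.findall(text):
        if data:
            if data.strip():
                stack[-1][1].append(data)
        elif close:
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError(f"unexpected end-tag </{name}>")
            stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"missing end-tag for <{stack[-1][0]}>")
    return root

tree = parse("<p>This is a short paragraph.</p>")
```

The stack mirrors the element hierarchy: a start-tag pushes a new node, and the matching end-tag pops it, so badly nested tags are detected as soon as they occur.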


A drawback of usual declarative markup systems is that one is forced to use the catalog of
markup tags which is offered by the system. Since markup tags express the logical structure of
documents, it means one cannot define the logical structure in terms other than the general
tags set up once and for all by the system.


A property of SGML is that tags are themselves described through a formal language: the
SGML meta-language, which may be used within SGML documents to dynamically define
new symbols. A meta-language construct is introduced simply by following < with !.


The SGML meta-language allows for the definition of complex constructs, named
elements. An element declaration defines a class of objects, i.e. an element type.
Subsequent objects in the document may be tagged with the element name. Elements may 
have a hierarchical structure, and each element in the hierarchy may have its own attributes. 
Element types may be used either to facilitate the interactive creation of documents, to 
control the validity of a document structure, or to associate a layout style to a particular 
document type. 


For example, one might define a document type for a conference paper as follows: 


<!ELEMENT
    1 paper     (title abstract body)
                language CHARS
    2 title     (#CDATA)
    3 abstract  (p)
    4 body      (p*)
>


This document type declaration specifies that a paper has a title, an abstract, and a body.
The title consists of characters, the abstract is one paragraph and the body zero or more
paragraphs. A paper has a language attribute to indicate in which language it is written. More
complex combinations can be designed to define document types that have some
commonality.


The facility to define new elements causes trouble when laying out those elements,
because the formatting system then does not know how to format such constructs. SGML
provides two ways of handling that situation. The first one is naturally to add to the SGML
system a procedure to take care of the new tags. This requires a good knowledge of the system
and prohibits further interchange of documents with such tags with systems which do not have
this procedure. The second one is to use a LINK tag. A LINK tag tells the system that a
construct should be handled as another one, presumably known to the system, with
possible attribute modifications. For example, if one says <!LINK abstract paragraph
indent=5>, it means an abstract has to be formatted like a paragraph, but using a
different indentation value.
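
The effect of such a link can be sketched in Python as a lookup that falls back on a known construct and then applies the modified attributes. The tables, the font attribute and the resolve_style function are hypothetical illustrations, not part of SGML.

```python
# Formatting attributes the system already knows (values are illustrative).
base_styles = {"paragraph": {"indent": 0, "font": "body"}}

# Each link maps a new element to a known one plus attribute modifications,
# in the spirit of <!LINK abstract paragraph indent=5>.
links = {"abstract": ("paragraph", {"indent": 5})}

def resolve_style(element):
    """Return the formatting attributes for an element, following links if needed."""
    if element in base_styles:
        return dict(base_styles[element])
    if element in links:
        target, overrides = links[element]
        return {**resolve_style(target), **overrides}
    raise KeyError(f"no formatting procedure or link for element: {element}")

abstract_style = resolve_style("abstract")
```

An element that has neither a procedure nor a link raises an error here, which corresponds to the formatter not knowing how to lay out the new construct.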


It is often required in a document to be able to refer to other parts of the document.
Some binding mechanism is needed in the formalism to attach a value to some identifier,
which resembles programming language variables. Binding is achieved in SGML through
entity declarations and entity references. An entity (a value, a character string or any valid
SGML constituent) may be bound to a name by the notation <!ENTITY name entity value>.
From then on, that entity may be referenced by its name either to set an attribute value,
or to be included in the running text. Entities also provide a means to handle non-character
data. An external entity is declared <!ENTITY name SYSTEM system information>. Then it is
known that this entity is not in the document stream. The processing system will find in the
system information how to access that content.
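
The binding of names to values can be sketched in Python as a dictionary consulted during text expansion. The text above does not show how a reference is written; the sketch assumes the &name; form, and the function name expand_entities and the sample entity are hypothetical.

```python
import re

def expand_entities(text, entities):
    """Replace each &name; reference (an assumed syntax) with the value bound to name."""
    def lookup(match):
        name = match.group(1)
        if name not in entities:
            raise KeyError(f"undeclared entity: {name}")
        return entities[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", lookup, text)

# Plays the role of a declaration such as <!ENTITY org ...>.
entities = {"org": "Xerox PARC"}
expanded = expand_entities("Interscript was designed at &org;.", entities)
```

An external entity would differ only in where the value comes from: instead of a string held in the table, the processing system would use the system information to fetch the content.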


If the document is to be interchanged among different computers with different 
operating systems, this system information is specific to each system. SGML provides an 
IGNORE/INCLUDE mechanism for that purpose. Information relative to some particular system, 
say osx, has to be encoded within the magic declaration <![osx;[<?commands for osx
system>]]>. Then a user only needs to turn a switch at the beginning of the document to the 
local system for the document to be processed correctly. 


4. Interscript


We mentioned previously that Interscript is a representation formalism from a new
generation. Interscript, which was originally designed at Xerox PARC, starts from the idea
that a document representation should be suited to processing by computers, not by the
humans who manipulate documents.


Such things as traversing trees, evaluating expressions, and searching for values of variables
within contexts are among what computers can easily do. Thus, a fundamental notion in
Interscript is to rely on a formal language to describe document constructs: not only the
logical structure of a document, but all formal constructs that could be necessary in a document
representation. These abstract constructs may be data structures such as paragraphs, fonts,
geometric shapes, but may also represent computations, like setting a context or evaluating
expressions within some context.


The Interscript approach is very much like the approach used in software engineering:
general programming languages are used by people to build abstract constructs and
procedures to solve their particular problem. A document representation problem should be
solved using a document representation language. The Interscript base language is simple
(around 25 grammar rules) and powerful. Its semantics are well defined but its syntax rapidly
leads to documents that cannot be managed by humans.


A document encoded in the Interscript base language is called a script. A script is very
much like a program. The processing paradigm (figure 2) is that a script should first be
internalized by a system. Internalizing a script implies execution of computations, which are
dictated only by Interscript base language semantics, and result in the construction of another
representation available for the client process.


This simply means that one translates a standard disk representation into a non-standard
memory representation, performing computations along the way.


[Diagram: input script --internalization--> internalized document --externalization--> output script]


Figure 2. Interscript processing model.


Computations are necessary in the internalizing process because the base language
includes a binding mechanism and the evaluation of expressions within hierarchical contexts.
For example, evaluating the expression

rightmargin = leftmargin + linelength

requires obtaining the values bound to the variable names.
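
Evaluation within hierarchical contexts can be sketched in Python with chained dictionaries, where an inner context sees the bindings of the contexts that enclose it unless it rebinds a name. The context names and values below are invented for illustration, and this is not the Interscript base language.

```python
from collections import ChainMap

# Outermost context: bindings that hold for the whole document.
document = ChainMap({"leftmargin": 10, "linelength": 60})
# Inner contexts rebind one name each and inherit the rest.
chapter = document.new_child({"linelength": 50})
paragraph = chapter.new_child({"leftmargin": 15})

def evaluate_rightmargin(context):
    """Evaluate rightmargin = leftmargin + linelength in the given context."""
    return context["leftmargin"] + context["linelength"]

margins = [evaluate_rightmargin(c) for c in (document, chapter, paragraph)]
```

The same expression yields a different value in each context, which is why a script cannot simply be copied into memory: the bindings in force at each point have to be resolved.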


In this paper we will not go into the details of the internalizing process, which resembles
the evaluation of any interpreted programming language, and focus instead on the central concepts
of node and tag.


A script is a hierarchy of nodes. Nodes have contents and tags. The authors have
compared an Interscript node to a bottle of wine. The contents of the bottle are qualified by
several tags on the bottle: a price tag, a product number tag. Interscript tags similarly
qualify the node contents. To some extent an Interscript tag is similar to an SGML tag: it
introduces an element, it has attributes, it denotes structural properties of the contents.


The difference is that, first, an Interscript node may have several simultaneous tags and, 
second, attributes of a tag may be bound to an expression which must be evaluated. For 
example, a figure caption could be affixed with both a CAPTION and a PARAGRAPH tag. The 
paragraph tag says that the caption text has to be laid out as a paragraph. The caption tag 
restricts the placement of that paragraph relative to the figure picture. The leftmargin 
attribute of the paragraph might be set equal to the margin of some object X. Then the node 
hierarchy is searched for that X. 
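

The two differences can be illustrated with a small sketch. It is not Interscript itself: 
the Node class, the attribute names, and the outward search order used by find are all 
assumptions made for illustration. It only shows a node carrying two tags at once and an 
attribute bound to an expression that is evaluated by searching the node hierarchy. 

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Union

# Hypothetical representation: an attribute is a constant or an
# expression evaluated against the node that carries it.
Value = Union[int, Callable[["Node"], int]]


@dataclass
class Node:
    tags: set = field(default_factory=set)      # several simultaneous tags
    attrs: dict = field(default_factory=dict)   # attribute name -> Value
    name: Optional[str] = None                  # lets other nodes refer to "X"
    children: list = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and all its descendants."""
        yield self
        for child in self.children:
            yield from child.walk()

    def find(self, name: str) -> "Node":
        """Search this node, then each enclosing node, for a node called `name`."""
        scope = self
        while scope is not None:
            for candidate in scope.walk():
                if candidate.name == name:
                    return candidate
            scope = scope.parent
        raise KeyError(name)

    def value(self, attr: str) -> int:
        """Return an attribute, evaluating it first if it is an expression."""
        bound = self.attrs[attr]
        return bound(self) if callable(bound) else bound


document = Node(tags={"DOCUMENT"})
picture = document.add(Node(tags={"FIGURE"}, name="X", attrs={"margin": 12}))
caption = document.add(Node(
    tags={"CAPTION", "PARAGRAPH"},
    attrs={"PARAGRAPH.leftmargin": lambda node: node.find("X").value("margin")},
))
print(caption.value("PARAGRAPH.leftmargin"))  # prints 12
```

Because the binding is an expression rather than a copied value, changing the margin of X 
changes the caption's left margin the next time it is evaluated. 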


Interscript syntax denotes nodes between curly braces. Tags are character strings 
followed by a dollar sign. A typical node is: 


{ PARAGRAPH$ PARAGRAPH.leftmargin = 10 
{CHARS$ <paragraph text content> }} 


4.1 The pouring process 


Markup languages do not provide good support for describing layout. They start from 
the idea that a user should hardly be able to specify layout, in order to enforce style 
discipline. It is true that users of a document preparation system are usually not interested 
in setting line and page breaks, selecting fonts, positioning titles, etc. However, they are 
often concerned with the placement of logos and page numbers, or with whether there are one 
or more columns: what we might call macroscopic layout. 


Interscript provides for that purpose a comprehensive mechanism we shall name 
descriptive layout. Descriptive layout does not prohibit the use of styles; it would rather 
enforce their use. However, it also allows for the specification of high level layout. Tags 
have been defined which symbolically represent the layout process. By placing those tags at 
appropriate places and specifying attribute values, a user may indicate to a formatting 
process how layout should be achieved. 


All of those specifications appear as parameters of the Interscript pouring process. The 
Interscript metaphor for this process is that the document content is poured into some liquid 
layout, resulting in a solid layout. Liquid layout basically serves as a template which guides 
the pouring process in its actions. The pouring process is naturally described by means of 
constructs expressed in the base language. The fundamental pattern for invoking a pouring 
process is: 


{ POUR$ 
POUR.template = {TEMPLATE$ -- template -- } 
contents to be poured } 


A template is basically a hierarchy of boxes. A box defines a rectangular area to which 
constraints apply to locate it relative to other boxes. Assume a user wants a page layout as 
shown in figure 3. That page has a header at the top and a logo below the header. The 
content should be laid out in an area on the right of the page, leaving some room for margin 
comments that should be placed on the left. 


When the content is poured into the page, the pouring process must not pour any content 
into the heading or the logo box, nor may it pour text content into the margin comment 
box. This correct placement of data is ensured by the MOLD mechanism. 


Figure 3. A page layout. 


When a box is to receive data, it shows a MOLD tag accompanied by a label. The pouring 
process does not try to pour content into boxes that do not have a mold tag; it places them 
directly within the page box. If a box is a mold, the pouring process looks in the node 
contents for nodes with a matching label. All content portions with a matching label are to 
be poured into that box. 


It may be the case that there is too much content to be poured to fit on a single page. A 
template may specify that it is a sequential, an iterative, or an alternative template. In a 
sequential template, the pouring process will consider all boxes in the hierarchy sequentially. 
If a template is full, or no more matching content exists, it considers the next mold. An 
iterative template will repeat itself until all matching content has been poured. 
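

The mold mechanism and the sequential and iterative templates can be sketched as follows. 
This is a simplified model under stated assumptions, not the Interscript pouring process: 
Box, pour_page, and pour_iterative are hypothetical names, content is a flat list of 
(label, text) pairs, and the capacity of a box is a count of content portions standing in 
for real geometric area. 

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    name: str
    mold: Optional[str] = None   # label of the content accepted; None = not a mold
    capacity: int = 0            # portions that fit (a stand-in for real area)


def pour_page(boxes, content):
    """Pour labeled content into one copy of a sequential template.

    `content` is a list of (label, text) pairs; poured items are removed
    from it. Returns {box name: [texts]} for the resulting page.
    """
    page = {}
    for box in boxes:                      # boxes are considered sequentially
        placed = []
        if box.mold is not None:           # non-mold boxes receive no content
            i = 0
            while i < len(content) and len(placed) < box.capacity:
                label, text = content[i]
                if label == box.mold:
                    placed.append(text)
                    del content[i]
                else:
                    i += 1
        page[box.name] = placed
    return page


def pour_iterative(boxes, content):
    """Repeat the template until no content matching any mold remains."""
    remaining = list(content)
    molds = {box.mold for box in boxes if box.mold is not None}
    pages = []
    while any(label in molds for label, _ in remaining):
        before = len(remaining)
        pages.append(pour_page(boxes, remaining))
        if len(remaining) == before:       # nothing fit: stop instead of looping
            break
    return pages, remaining


template = [
    Box("header"),
    Box("logo"),
    Box("comments", mold="comment", capacity=1),
    Box("body", mold="text", capacity=2),
]
content = [("text", "p1"), ("comment", "c1"), ("text", "p2"), ("text", "p3")]
pages, leftover = pour_iterative(template, content)
print(len(pages), pages[1]["body"])  # prints 2 ['p3']
```

The header and logo boxes carry no mold label, so they appear on every page but never 
receive poured content. 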


An alternative template specifies different possibilities for pouring the content; the layout 
process is responsible for choosing one. One possibility is to try all of them and pick the 
best according to its own criteria. Another possibility is to have additional tags indicating 
when or how to select an alternative. For example, one might indicate one template to be used 
on screens and another one for paper. 


Templates may also be combined. Figure 4 shows a typical example, a paper like this 
one. The first page has a particular layout showing the title, the authors, and an abstract of 
the paper; the body of the paper starts below the abstract. All subsequent pages are in the 
same format, showing only a page heading and text. The template for that document is a 
sequential one. It contains the first page as a single box (a page is a particular box), 
followed by an iterative page template. 


Many other possibilities are offered by this descriptive layout process. We have 
described only its general properties in this paper. 


5. Office Document Architecture 


Office Document Architecture (ODA) is a standard developed within ISO to standardize 
the data structures used for the digital representation of documents. The most 
particular property of ODA is that it does not fit in the traditional architecture schema: ODA 
defines simultaneously the logical structure and the layout structure of a document, i.e. it is 
both a logical and a page description format. 


The argument in favour of that unique representation seems to be that most editing 
systems have to manage both structures. The standard says (part 2, page 75): 


“In a text processing system with separate editing and formatting subsystems, the specific 
layout is created after any changes to the specific logical structure and content have been 
made. In a word processor type editor, small editing changes may be incorporated 
directly into existing specific layout structure after every command, without recreating 
the entire specific layout structure.” 


Figure 4. Two different page layouts in a single template. 


This issue is discussed in the conclusion. This section only tries to present the ODA 
formalism. Figure 5 shows the main constituents of a document. It has six parts: a document 
profile, a document style, a generic and a specific logical structure, and a generic and a 
specific layout structure. 


A document may actually contain only one of the structures. This is indicated in the 
document profile when the document is transmitted. The document profile also contains data 
related to the whole document: creation date, last alteration date, originators, status, etc. 


Generic and specific are to be interpreted respectively as class and instance. For example, 
a generic structure named conference paper will describe the general structure and properties 
of a conference paper, as shown in the SGML section. A particular instance of a conference 
paper will be described by its specific structure. Attributes defined in the generic structure 
are given values in the specific structure, possibly a default value specified in the generic 
part. The specific structure should be consistent with the generic one, and will probably 
inherit properties from the generic structure. Specific logical and layout structures are 
trees whose leaf nodes are named basic objects and whose other nodes are named composite 
objects. Any node may carry attributes. 
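

The class and instance reading can be made concrete with a short sketch. It is only an 
analogy under stated assumptions: GenericObject and SpecificObject are hypothetical names, 
not ODA terms for data structures, and the standard's actual attribute and consistency 
rules are much richer than the single check shown here. 

```python
class GenericObject:
    """An object class: declares attributes and their default values."""

    def __init__(self, name, defaults):
        self.name = name
        self.defaults = dict(defaults)


class SpecificObject:
    """An instance of a generic object, forming a node of a specific structure."""

    def __init__(self, generic, children=(), **attrs):
        unknown = set(attrs) - set(generic.defaults)
        if unknown:
            # Consistency: an instance may only set attributes its class declares.
            raise ValueError(f"not declared by {generic.name}: {sorted(unknown)}")
        self.generic = generic
        self.attrs = attrs
        self.children = list(children)

    def get(self, attr):
        """Value set on the instance, else the default from the generic part."""
        return self.attrs.get(attr, self.generic.defaults[attr])

    @property
    def is_basic(self):
        """Leaf nodes are basic objects; all other nodes are composite objects."""
        return not self.children


paragraph_class = GenericObject("paragraph", {"indent": 0, "justified": True})
paper_class = GenericObject("conference paper", {"columns": 1})

first = SpecificObject(paragraph_class, indent=5)
second = SpecificObject(paragraph_class)
paper = SpecificObject(paper_class, children=[first, second], columns=2)

print(first.get("indent"), second.get("indent"), paper.is_basic)  # prints 5 0 False
```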


The specific logical structure expresses the structure of the document in terms of, e.g., 
paragraphs, chapters, titles, etc. The specific layout structure is a tree of page sets (a set 
of pages identified as a single entity), pages, frames and blocks. Blocks and frames are 
rectangular areas located within frames and pages. 


Blocks and basic logical objects both refer to the document content. This content is 
divided into content portions. A content portion is governed by a content architecture, which 
basically defines the content type (characters, images, graphics) and its encoding mechanism. 


The logical and layout structures are clearly not independent: they refer to the same 
document content and they may have reciprocal pointers. 


Figure 5. ODA document components. 


Figure 6 (reproduced from the standard) shows that the coexistence of the two structures 
implies particular constraints. A paragraph which spans two pages has to be split into 
two content portions. 





Styles are simply named sets of attributes, which can be referenced from other 
components in the document. They are divided into layout styles and presentation styles. A 
presentation style is attached to a basic object and depends upon the nature of that object. 
For characters, it would indicate font information; for images, it would probably indicate 
colors or half-toning. A layout style defines global style information. It can be referenced 
only from logical objects. 


Figure 6. Simultaneous layout and logical structure. 


The standard is somewhat fuzzy about generic structures. There is no clause devoted to 
the description of generic structures, while there is one for each specific structure. It says 
that object class descriptions are used by the editing process to construct a specific logical 
structure, but it does not say much about such descriptions. Part 3 of the standard, which 
describes the layout process, indicates how the generic layout structure should be used, and 
hence gives a clearer idea. 


The generic layout structure contains common content portions, for example logos or 
headings that should be used in many places in the document. It also serves as a guide for the 
layout process. 


6. Conclusion 


We have focused in this paper on three representation formalisms considered as revisable 
form representations, namely SGML, ODA and Interscript. Of these three formalisms, 
SGML and ODA have reached the status of ISO draft proposal, which means they will 
become definitive standards with very few modifications. 


SGML results from experience accumulated over more than ten years of practice in the 
field of markup languages. The standard has a precise definition, which makes 
it possible to rigorously parse a document. Document markup tags may induce hierarchical 
structures expressing the logical content of a document. A simple binding system among 
entities has been introduced, which allows for cross referencing among entities. 


Given that SGML has been running on many machines and that high quality textbooks 
have been produced through an SGML system, a vendor marketing a markup 
document system would probably do better to adopt SGML than to invent a new formalism. 


ODA did not follow the same standardization process as SGML. ODA is an attempt by 
an ISO working group, consisting mostly of representatives of text processing system vendors, 
to define a standard before there are hundreds of incompatible representation formalisms 
around the world. Hence there is no current practice of ODA, and it is only 
expected that most new systems will use ODA. However, there are a few objections to 
actually using ODA. 


The choice of two coexisting structures, layout and logical, might lead to 
implementation problems. The content portions which have to be split to satisfy layout 
constraints will have to be collected again when the layout is modified. 


Now that most printing device vendors have upgraded their machines to have a 
procedural page description language, the design of the layout structure in terms of frames 
and blocks looks old fashioned and contradictory with the fact that ODA claims to be a 
standard for future systems. 


One might also fear that, with ODA, vendors will actually offer “ODA subsets”. 
In particular, SGML can be considered such a subset. The standard explicitly 
states, probably for reasons of compatibility among ISO standards, that an SGML document 
may be transmitted within an ODA document (part 5, page 4): 


“Any subdocument within a (ODA) document may be represented either by descriptors 
and text units or by an SGML entity set. An SGML entity set is a self contained unit of 
SGML information, which is denoted by the term document in the SGML standard.” 


A vendor may actually sell an SGML system as an ODA subset system. If each vendor 
offers a subset of ODA, it might be the case that all of those systems will actually be 
incompatible, which is not desirable for an interchange standard. This argument is naturally 
true for all standards, but the design of ODA makes it easier to have closed subsets. 


For example, it is possible to design an ODA editing system that would ignore 
the layout part entirely and restrict itself to the logical structures and styles. This editing 
system will output only documents with logical structures and styles. These documents may 
be interchanged using the ODA format. 


It is also possible, in terms of delimiting an ODA subset, to design a simple word 
processing machine that would use no logical structure at all and would produce office 
documents with only a layout structure. But those two ODA systems would not be able to 
interchange a single document. 


Moreover, given the complexity of ODA, it seems that a complete ODA system can hardly be 
implemented on a small word processing workstation. Implementors of these relatively small 
workstations, who are willing to manage documents with both logical and layout structure, 
will probably have to define a subset in order to maintain satisfactory performance. 


Though SGML has been designed so that human beings may enter markup tags into a 
document, it might well be used as an internal representation for an editor that would not 
appear to the user as a markup system. Then the structuring possibilities offered by SGML 
may be used by the implementors to represent complex internal structures, producing 
facilities equivalent to those of ODA. Documents produced by such an editor could hardly be 
revised by humans from a standard terminal, but they could still be output with the high 
quality of an SGML formatting system. Thus a vendor who is willing to implement a 
document preparation system has to choose between two international standards. 


Interscript is not an international standard and it seems it will not become one. The 
reason might be that the Interscript design is too great a departure from existing formalisms 
to be accepted by most vendors, who are mostly interested in standards. Remember that bitmap 
displays were developed at Xerox PARC in 1975. In 1985, still very few vendors offer text 
processing systems with a bitmap display and a pointing device. Interscript was also born at 
Xerox PARC in 1983 [Ayers & al], as a result of several years of experience with powerful text 
processing systems running on bitmap displays. 


Although Interscript will not be a standard in the eighties, it has introduced two 
important ideas: the notion of a base language associated with an internalization process, and 
descriptive layout. These ideas should be retained by people who are participating in the 
design of a new generation of document preparation systems. 


Interscript proves that a base language can be defined which encompasses all abstractions 
that can be found in the document preparation world. It can equally well describe a document's 
logical structure, the properties and structure of various kinds of entities (font, paragraph, 
etc.), and functional symbolisms like the pouring operation. 


A base language considerably simplifies the software development of systems once it is 
implemented, but above all it gives cleanness to the systems and clarity to the concepts. The 
Interscript base language is certainly not perfect. It can be improved, and it might actually 
be too powerful for its goals. 


Similarly, the idea of a layout process formally described and specified by abstract 
constructs can be expressed in other terms than the particular Interscript pouring process. 
But both concepts have opened a direction for present research in the field. 


Electronic Sources of Illustrations 


Maureen Stone 
Xerox Palo Alto Research Center 


1. Introduction 


In this paper we will discuss the common sources of electronically produced illustration 
material. This is not intended to be a complete analysis of commercially available systems. It 
is an attempt to characterize these systems for the potential designer or expert user of software 
for illustrations. This analysis will focus on four areas: the user model, computer graphics 
techniques, hardware/system requirements, and graphic arts quality. 


Graphic arts quality means that we want to judge the output of the system by the existing 
standards of the graphic arts industry. The image should be interesting independent of its 
origin as a digitally produced illustration. This means not having unintended jaggies, having 
sophisticated, well produced fonts, smooth lines in a variety of weights with smooth or 
mitered corners as appropriate, elegant arrowheads and symbols, a tasteful use of colors and 
textures, and commercial quality reproduction of continuous tone images. Page description 
languages have been designed to describe two dimensional illustrations such that these quality 
issues can be addressed on a wide range of devices. Our analysis will relate the semantics of 
the image description presented by the illustration system to that provided by a page 
description language. 


Our analysis groups commercially available systems used for illustration into five areas: 
bitmap/pixel painting, drafting/CAD (computer aided design) systems, geometric illustration 
systems, business graphics, and full color prepress systems. These areas are not exclusive: 
some products span multiple areas, and the trend towards integrated software systems is 
blurring distinctions even more. Our taxonomy illuminates, however, the relative strengths 
and weaknesses of the different paradigms for producing illustrations. 


2. Including Illustrations in a Document 


This course is concerned with documentation, so the only pictures that are interesting 
here are those that can be included in a document. The minimum requirement for including 
the output of a picture making system in a document is that the image must be expressible in 
a stand alone manner. Page description languages provide the ideal representation for this 
purpose. When creating images with an interactive computer system, a primary consideration 
is whether it is possible to aesthetically translate the screen image to the print media. The 
keyword here is aesthetically. Once on a printed page, we adopt the quality definitions of the 
print media, whose characteristics are very different from those of the display media. Text and 
shapes rendered at screen resolution on a printed page look unacceptably jaggy. Textures 
suitable for displays look coarse on paper. Printers typically have a more limited number of 
gray levels and colors. Many vivid monitor colors cannot be reproduced on reflective media 
at all and approximating these colors may dramatically change the impact of the picture. 


The minimum page layout system applies a simple paste-up model to positioning figures. 
The figures need only be reproducible, though it is helpful if each includes its own size 
(bounding area). To meet layout requirements, it may be necessary to scale or crop the image, 
which translates in computer graphics terms to translation, scaling, and clipping. The next 
step up in layout flexibility is to be able to replace or remove elements. For example, 
commercial layout systems that accept CAD produced figures often include a way to replace 
the plotter-oriented text produced by the CAD system with typeset labels. The most flexible 
layout system would allow full editing of illustrations in place on the page. Commercially 
available systems do combine formatted text and graphics [8] but with limitations on the 
complexity of the final page image. 


The combined illustrations in a document should all be stylistically similar and 
coordinated with the text. Style elements include text and symbol fonts, line weights, color, 
and texture. Line weights, dash styles, and arrowheads should be the same from figure to 
figure and should harmonize with the style of the document. Color coding should be 
uniform; color schemes should not clash and should be suitable for the final output device. It 
has been suggested [2] that these style elements could be separated from the geometric or 
schematic definition of the figures to enforce consistent style and allow for uniform media 
dependent adjustments. Such a separation would make it possible to combine illustrations 
from disparate sources by adjusting only the style parameters. However, to our knowledge, 
no commercially available systems support the notion of graphical style. 


The definition of the ideal integrated text and graphics system for publication 
applications is still a research area: it must combine all the best features of graphics and text 
editors with a layout editor, a database for reference libraries, clip art, and revision history, 
be extensible to include new features as required, and have a well designed user interface, 
suitable for the publication professional. 


3. Areas of Comparison 


As stated in the introduction, these notes will focus on four general areas when 
describing illustration sources. The first of these is the user model, the model the system 
presents for building up a picture. Systems typically treat the image as either an array of 
pixels or as a collection of geometric shapes. The user can manipulate this model with a set of 
interactive tools. The second area will be to list some of the computer graphics techniques that 
are part of implementing a particular type of system. The list of techniques will not be 
exhaustive but should include most of those specific to the type of system under discussion, 
particularly those that may not be obviously difficult. The third area will be a general 
discussion of the hardware and system requirements for a particular application. This 
discussion is meant to be relative rather than absolute. For example, most geometric systems 
require more computing power than do painting systems. Finally, we will discuss how 
suitable the system is for meeting the graphic arts quality issues discussed in the introduction. 


4. Common Issues 


There are some basic issues that are common to all the systems analysed so we will 
discuss them here. These topics are resolution, gray-scale and color reproduction, and graphic 
arts quality typography. 


Resolution 


The output devices commonly used in preparing documentation graphics are monitors 
and digital printers. Resolution is a measure of the number and/or density of picture 
elements in a raster device. A resolution independent representation of an illustration is one 
that makes it possible to render an image at any resolution, achieving the best possible result 
for that resolution. The easiest way to achieve resolution independence is to use a geometric 
representation which can be smoothly scan converted to an arbitrary resolution as the basis 
for the illustration. Pixel oriented illustrations are difficult to translate to paper without 
visible aliasing effects or jaggies. 


Normally, an illustration will be designed on a monitor and ultimately output to a 
printer. A typical color monitor resolution is 640 by 480 pixels, and each pixel can have up to 
256 intensity levels for each primary color. Black and white displays are often higher 
resolution, but still on the order of 1000 pixels across. The spacing of the pixels depends on 
the size of the monitor tube, but 72 spots to the inch (spi) is a typical resolution for a 
black/white display. Digital printers have a wide range of resolutions up to 2000 spots to the 
inch but they are bi-level devices. Typical resolutions for dot matrix printers are around 100 
spi or less, for laser printers around 300 spi, for photographic image setters about 1000 spi, 
and for prepress quality film plotters up to 2000 spi. 


In summary, monitors are low resolution but may have intensity levels whereas graphic 
arts quality printers are high resolution but must simulate intensity levels with patterns 
modeled after the halftoning techniques developed for the offset printing industry. Line art 
must be anti-aliased to appear smooth on a monitor, but will appear smooth at resolutions of 
300 spi and greater when printed. Text must be specially crafted to be readable at monitor 
resolutions but can be rendered from outline representations (with care) at 300 spi and easily 
at 600 spi and up. In general, higher resolution devices produce higher quality images and 
cost more. Illustration systems must balance the desire for high resolution with the overall 
cost of the system. 


Gray-scale and Color Reproduction 


Intensity variations on a color monitor are achieved by exciting the monitor phosphor to 
produce different amounts of light. Intensity variation on a digital printer is achieved by the 
use of patterns called halftones, as in conventional offset printing [7,16]. The technique of 
halftoning trades spatial resolution for intensity levels as shown in Figure 1. The number of 
gray levels obtained by this technique is a function of the number of dots simulating the 
halftone pattern. Since the printer dot size is fixed, there is a tradeoff between edge sharpness 
and the number of gray levels—a higher resolution dot is bigger. Commercial printing 
typically uses halftone frequencies of 60 to 150 lines per inch. To simulate this on a digital 
printer and maintain approximately 100 gray levels, we need printer resolutions of 
approximately 600 to 2000 lines per inch. 
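

The arithmetic behind these figures can be checked with a short sketch. It assumes the 
simplest model, a square halftone cell of n by n bi-level printer spots on an unrotated 
screen, so that a cell can show n*n + 1 gray levels (from no spots set to all spots set). 
The function names are ours, chosen for illustration. 

```python
import math


def gray_levels(cell_size):
    """Gray levels representable by a cell of cell_size x cell_size spots."""
    return cell_size * cell_size + 1


def cell_size_for(levels):
    """Smallest square cell giving at least `levels` gray levels."""
    n = math.isqrt(levels - 1)
    if n * n < levels - 1:
        n += 1
    return n


def printer_resolution(screen_lpi, levels):
    """Printer spots per inch needed for a halftone screen of `screen_lpi`
    lines per inch that keeps at least `levels` gray levels."""
    return screen_lpi * cell_size_for(levels)


print(gray_levels(4))                # prints 17
print(cell_size_for(100))            # prints 10
print(printer_resolution(60, 100))   # prints 600
print(printer_resolution(150, 100))  # prints 1500
```

Under this model a 10 by 10 cell is needed for about 100 gray levels, so screens of 60 to 
150 lines per inch call for roughly 600 to 1500 spots per inch, which falls within the 
range quoted above. 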


One common problem with printing computer generated continuous tone images is the 
appearance of contour lines in smoothly shaded areas. These contour lines are caused by 
quantization in the printed rendering of shades of gray. This is caused by the limits of the 
resolution available to simulate the halftone screen. For example, using a 4 by 4 array of 
pixels to model a halftone dot produces a maximum of 17 gray levels. This is not adequate to 
produce the appearance of continuously changing shades of gray. So dramatic a limitation 
would cause artifacts in any example, but the problem exists even at higher resolutions. The 
problem is worse for computer generated images than for scanned images because scanned 
images have a background noise level as an artifact of the scanning process, which tends to 
mask the contours, whereas computer generated images are noise free. One solution is to add 
random noise to the image to break up the contour patterns. 


Figure 1: Digitally produced halftone patterns 
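

The effect of added noise can be demonstrated on a one dimensional gray ramp. The sketch 
below is one possible reading of the technique, not a description of any particular system: 
it assumes uniform noise of at most half a quantization step, added before each value is 
rounded to one of 17 levels. The function names are ours. 

```python
import random


def quantize(value, levels):
    """Index of the nearest of `levels` evenly spaced grays for value in [0, 1]."""
    step = 1.0 / (levels - 1)
    index = round(value / step)
    return min(max(index, 0), levels - 1)


def ramp(n):
    """A smooth ramp of n samples from 0.0 (black) to 1.0 (white)."""
    return [i / (n - 1) for i in range(n)]


def quantize_row(row, levels, noise=0.0, rng=None):
    """Quantize a row, first adding uniform noise of +/- `noise` steps."""
    rng = rng or random.Random(0)
    step = 1.0 / (levels - 1)
    return [quantize(v + rng.uniform(-noise, noise) * step, levels) for v in row]


def transitions(indices):
    """Number of places where neighbouring samples differ."""
    return sum(1 for a, b in zip(indices, indices[1:]) if a != b)


row = ramp(512)
plain = quantize_row(row, 17)             # 17 flat bands with sharp edges
noisy = quantize_row(row, 17, noise=0.5)  # band edges are scattered
print(transitions(plain), transitions(noisy))
```

Without noise the ramp collapses into flat bands separated by a few sharp edges, which is 
what the eye sees as contours. With noise, samples near a band edge fall on either side of 
it at random, so the single sharp edge is replaced by a scattered zone. 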


A color monitor produces colors when the electron beam stimulates red, green, and blue 
phosphor dots to produce a pattern of red, green, and blue luminous spots. The brightness of 
the dots is a function of the power supplied by the electron beam. The color system described 
is additive and the three primaries are independent. A color for a monitor, therefore, is 
specified as a red, green, blue triple, hereafter called an RGB value. 


A color printer produces color by overlaying spots of magenta, cyan, yellow, and 
(optionally) black inks on paper. Patterning inks on paper is an additive system, with the 
white light from the uncovered part of the paper adding to the colored light reflecting from 
the inked portion of the paper. Overlaying inks on paper is a subtractive color system, that is, 
the inks act as a set of filters so the color is determined by multiplying the transmission values 
of the inks together. Halftoning is a combination of these two color models. Mapping 
between the additive color model of a color monitor and the complex, non-linear color model 
of color printing is a difficult task. Some work in this area has been published recently 
[13,15]. A more general discussion of color in the digital graphic arts environment is available 
in a recent article [14], which has been reproduced elsewhere in these notes. 
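

The two color models can be contrasted in a few lines. This is a deliberately idealized 
sketch: it assumes perfect inks that each block exactly one primary and transmit the other 
two completely, and perfectly white paper. Real inks do not behave this way, which is why 
the mapping is difficult. The names overlay and halftone_mix are ours. 

```python
# Idealized ink transmissions as (red, green, blue) fractions. Assumption:
# each ink blocks one primary completely and passes the other two unchanged.
CYAN = (0.0, 1.0, 1.0)
MAGENTA = (1.0, 0.0, 1.0)
YELLOW = (1.0, 1.0, 0.0)
WHITE_PAPER = (1.0, 1.0, 1.0)


def overlay(*inks):
    """Subtractive model: overlaid inks act as filters in series, so their
    transmission values are multiplied together."""
    r = g = b = 1.0
    for ink in inks:
        r *= ink[0]
        g *= ink[1]
        b *= ink[2]
    return (r, g, b)


def halftone_mix(coverage, ink, paper=WHITE_PAPER):
    """Additive model: light from the inked fraction of the area adds to
    light from the uncovered paper, weighted by area coverage."""
    return tuple(coverage * i + (1.0 - coverage) * p for i, p in zip(ink, paper))


print(overlay(CYAN, YELLOW))                     # prints (0.0, 1.0, 0.0)
print(halftone_mix(0.5, CYAN))                   # prints (0.5, 1.0, 1.0)
print(halftone_mix(0.5, overlay(CYAN, YELLOW)))  # prints (0.5, 1.0, 0.5)
```

The last line combines both models, as halftoning does: the overlaid inks are multiplied 
first, and the result is then averaged with the bare paper. 

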


Graphic Arts Quality Typography 


The problem of producing graphic arts quality text is discussed in detail elsewhere in 
these notes. It is sufficient to note here that fonts designed as rasters for displays will look 
terrible on paper. Furthermore, any small shape or symbol, such as an arrowhead, has the 
same rendering and representation problems as fonts. 


5. Bitmap/Pixel Painting 


Popular painting systems let the user change the color of each pixel on the display 
independently, using the model of an electronic paintbrush and paint. The output is an array 
of pixels, either black and white or color, and is typically limited to display resolution. Often, 
scanned images can be combined with freehand drawing to quickly produce complex 
pictures. Many full color painting systems are combined with video editing systems and are 
intended to produce video animation. Painting systems provide the most flexible way of 
producing images on a computer, but they are limited by the resolution of their 
representation. They are easy to learn and use. The majority of computer art and computer 
illustration systems are painting systems. 


Computer Graphics Techniques 


Simple painting systems are easy to implement on any system with a raster display and a 
pointing device. The basic paint operation follows the motion of the pointing device, drawing 
a rectangle or other simple brush shape at every new position. Some care must be taken, 
however, to provide smooth motion. The input from the pointing device should be filtered to 
remove duplicate points and jitter. If a continuous line is desired, it may be necessary to 
interpolate the stroke between input positions. The brush image is typically stored as a bit 
pattern which is used to mask the application of the paint. A fast, two dimensional bit 
operation such as RasterOp [9] is very useful for the time-critical inner loop of the painting 
operation. 
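

The basic paint loop can be sketched as follows. This is a minimal illustration that 
assumes a bi-level image stored as a list of rows and integer pointer positions. It 
removes duplicate points but does not smooth jitter, and it uses a plain nested loop where 
a real system would use a fast operation such as RasterOp. All function names are ours. 

```python
def filter_points(points):
    """Drop consecutive duplicate positions reported by the pointing device."""
    kept = []
    for p in points:
        if not kept or p != kept[-1]:
            kept.append(p)
    return kept


def interpolate(p0, p1):
    """Integer positions from p0 to p1 inclusive, at most one pixel apart."""
    (x0, y0), (x1, y1) = p0, p1
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [p0]
    return [(x0 + round((x1 - x0) * t / steps),
             y0 + round((y1 - y0) * t / steps)) for t in range(steps + 1)]


def stamp(image, brush, cx, cy):
    """Apply paint through the brush mask, centred on (cx, cy)."""
    height, width = len(brush), len(brush[0])
    for by in range(height):
        for bx in range(width):
            if brush[by][bx]:                     # the bit pattern masks the paint
                x, y = cx + bx - width // 2, cy + by - height // 2
                if 0 <= y < len(image) and 0 <= x < len(image[0]):
                    image[y][x] = 1


def paint_stroke(image, brush, points):
    """Stamp the brush along the filtered, interpolated pointer path."""
    path = filter_points(points)
    if len(path) == 1:
        stamp(image, brush, *path[0])
    for a, b in zip(path, path[1:]):
        for x, y in interpolate(a, b):
            stamp(image, brush, x, y)


image = [[0] * 8 for _ in range(8)]
paint_stroke(image, [[1]], [(0, 0), (0, 0), (4, 0)])
print(image[0])  # prints [1, 1, 1, 1, 1, 0, 0, 0]
```

Without the interpolation step the stroke above would consist of two isolated dots, since 
the pointing device reported only its end points. 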


Bitmap painting systems use textures instead of paint colors. Display textures must be 
carefully designed to avoid flicker. When painting or filling an area the phase of the texture 
must be defined relative to some fixed grid such as the absolute position of the pointing 
device on the screen, so that overlaid areas of the same texture blend together correctly. 
Textures can either be opaque black and white or transparent, where only the black is written 
into the image (Figure 2). 


Figure 2: Opaque and transparent texture patterns. 
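

A sketch of texture filling shows why the phase matters. It assumes a bi-level image 
stored as a list of rows and takes the fixed grid to be absolute image coordinates, so the 
texture bit for a pixel depends only on where the pixel is, never on where the fill 
started. The function name fill_rect is ours. 

```python
def fill_rect(image, texture, x0, y0, x1, y1, transparent=False):
    """Fill the rectangle [x0, x1) x [y0, y1) with a tiled texture.

    The texture is indexed by absolute coordinates, which fixes its phase
    to the image grid. An opaque fill writes both black (1) and white (0);
    a transparent fill writes only the black bits.
    """
    tile_h, tile_w = len(texture), len(texture[0])
    for y in range(y0, y1):
        for x in range(x0, x1):
            bit = texture[y % tile_h][x % tile_w]
            if transparent:
                image[y][x] |= bit
            else:
                image[y][x] = bit


checker = [[1, 0],
           [0, 1]]

# Two overlapping fills that start at different origins.
overlapped = [[0] * 6 for _ in range(4)]
fill_rect(overlapped, checker, 0, 0, 4, 4)
fill_rect(overlapped, checker, 3, 1, 6, 4)

# A transparent fill over an all-black image leaves it black.
black = [[1] * 4 for _ in range(2)]
fill_rect(black, checker, 0, 0, 4, 2, transparent=True)

print(overlapped[1])  # prints [0, 1, 0, 1, 0, 1]
print(black[0])       # prints [1, 1, 1, 1]
```

Had the second fill started its texture at its own corner (3, 1) instead, the pattern would 
shift by one pixel there and a visible seam would appear along the overlap. 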


Color painting systems usually provide a way for users to mix custom colors. These 
colors can be opaque or translucent. Translucent colors add their value into the existing 
image and so follow the rules for additive color systems. To achieve soft edges and other 
subtle effects, a variety of spatial weighting functions can be used to mix the paint with the 
underlying image, simulating a mechanical painting technique called airbrushing. 


Most painting systems include features beyond the basic painting paradigm. Area fill by 
“flooding” a region of a single color with another is a common addition. Flood-fill algorithms 
are discussed in standard graphics textbooks [5]. A system aspect of using flood-fill 
algorithms is the importance of helping the user guarantee that the region specified is 
contained, or that its borders have no hole where the paint can “leak out.” Another common 
addition is structured brushes that produce straight lines, boxes, and circles by specifying a 
few design points. Once the shape has been displayed, the structure is lost. A special case of 
structured brushes is text. Most painting systems have some way to type or paste text labels 
into the picture. 
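
One way to help the user notice a leaking region is to report when a fill reaches the edge of the image. The sketch below is a plain 4-connected flood fill in Python; the "leaked" flag and the rule that touching the image border counts as a leak are assumptions for illustration, not a technique taken from the cited textbook.

```python
from collections import deque


def flood_fill(image, seed, new_color):
    """4-connected flood fill. Returns (pixels_filled, leaked)."""
    h, w = len(image), len(image[0])
    sx, sy = seed
    old = image[sy][sx]
    if old == new_color:
        return 0, False
    image[sy][sx] = new_color
    queue, filled, leaked = deque([(sx, sy)]), 0, False
    while queue:
        x, y = queue.popleft()
        filled += 1
        if x in (0, w - 1) or y in (0, h - 1):
            leaked = True  # heuristic: the paint reached the image border
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and image[ny][nx] == old:
                image[ny][nx] = new_color
                queue.append((nx, ny))
    return filled, leaked
```

An interactive system could use the flag to warn the user, or to undo the fill, when a supposedly closed outline has a gap.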


Hardware/System Requirements 


The minimum system requirements are an all-points-addressable raster display, a 
pointing device such as a mouse, and a processor with sufficient power to provide smooth 
interaction during the painting operation. The deluxe system has a full color monitor, image 
processing capabilities, and significant storage capacity, as each image will take a quarter of a 
megabyte or more to store. Most personal computer systems support some form of painting 
program. Output is typically intended to be viewed on the monitor, although a video tape 
recorder would be required for video animation systems. 


Graphic Arts Quality Issues 


The principal limitation of painting systems is the unstructured, display oriented 
output. Black and white images are unacceptably jaggy. The illustration can only be 
manipulated as an array of bits, independent of the visual structure. Colors and textures are 
monitor oriented and may not reproduce well on the printed page. Furthermore, the overall 
appearance of an illuminated display is qualitatively different from that of a reflective print. 
The artist using a display to design for print media must be aware of the effect of the 
differences in brightness, contrast, and resolution. 


The resolution limitations can be relieved somewhat by providing subpixel zooming, that 
is, the ability to paint at a higher resolution than that of the display. This does not provide 
any additional structure, however. Some research has been done to define shapes from their 
raster representation [10,11] but these algorithms are typically expensive and not 100% 
reliable. 


6. Drafting/CAD Systems 


The use of computers to aid mechanical design is well established. Drafting systems 
provide tools for making accurate orthogonal or isometric projections of three dimensional 
objects. Many design systems will also produce a simulated three dimensional view of an 
object. The principal goal of a CAD system, however, is to produce a mechanically accurate 
model represented either by its surface geometry or by its solid geometry. The user model is 
one of building up an object description, and the user interface contains many features that 
are directed towards the accuracy of the model rather than its appearance. It is often 
desirable, however, to use these drawings as illustrations, especially in technical publishing 
environments where the drawing is needed as part of the documentation. The problem is to 
extract a description of the model strictly in rendering terms such as a page description 
language. It typically takes many months to learn to use CAD systems effectively. 


Computer Graphics Techniques 


Most mechanical designs are designed and displayed as line drawings, so most CAD 
systems need fast line drawing techniques. Originally, these systems used vector displays. 
For systems with raster displays, there is often an accelerator for lines in microcode or 
hardware. Lines need to be displayed in different weights, colors, and styles (such as solid, 
dashed and dotted). Straight lines and circular arcs are the principal geometric forms, 
although certain industries require the use of spline curves. A real mechanical design is large 
and detailed, so fast pan and zoom are common features. 


The user interface must provide techniques for precise construction, such as gravity to 
snap the cursor to lines, points, and intersections; the ability to add numeric information; or 
compass and protractor equivalents for common geometric constructions such as parallel or 
perpendicular lines. A general purpose shape intersection algorithm is needed to support this 
user interface. 
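
A minimal Python sketch of two of these construction aids follows: gravity that snaps the cursor to a nearby candidate point, and the intersection of two lines, which supplies such candidates. The function names and the snap radius are hypothetical, and a production system would also handle arcs and curves.

```python
import math


def snap(cursor, points, radius=5.0):
    """Return the nearest candidate within radius, else the cursor itself."""
    best, best_d = cursor, radius
    for p in points:
        d = math.dist(cursor, p)
        if d <= best_d:
            best, best_d = p, d
    return best


def line_intersection(a, b, c, d):
    """Intersection of the infinite lines ab and cd, or None if parallel."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = a, b, c, d
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if den == 0:
        return None
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
```

For example, the intersection of two construction lines can be added to the list of gravity points so that the cursor snaps to it when it comes within the radius.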


The appearance of mechanical drawings is very standardized. Typical CAD systems help 
the user maintain these standards by providing special routines for dimensioning, 
crosshatching, symbols, and patterns. The appearance of dimensions is particularly well 
controlled. Labeling, too, is standardized, controlling the size and orientation of text. While 
not completely freeform, text must be displayed at arbitrary angles and in a range of sizes. 
Fonts are often vector defined to simplify this transformation and to accommodate the use of 
plotters as output devices. 


The user of a drafting system is accustomed to producing three dimensional designs by 
operating on orthographic and oblique views. A good design system will support this process 
by providing simultaneous construction of orthographic views and automatic construction of 
oblique views from orthographic projections. Many systems can produce a nicely shaded 
rendering of a 3D view of a design. 


Hardware/System Requirements 


A drafting system should have a high resolution display and a digitizing device such as a 
tablet. A digitizing device differs from a pointing device in that it produces absolute numeric 
information rather than relative positions. While there are several personal computer based 
drafting systems, production CAD systems are often large, that is, both storage and processor 
intensive. Floating point support for geometric operations is essential. Most CAD systems 
include a hardcopy output device capable of supporting large format drawings, usually either 
a pen plotter or a wide format electrostatic plotter. 


Graphic Arts Quality Issues 


CAD systems produce a geometric description of the image that is resolution 
independent. It is easy to apply geometric transformations such as scaling and clipping, which 
are useful in page layout. Many CAD systems assume a pen plotter as the final output 
medium. This, along with the industry standards, produces a particular style: lines in 
multiple colors and styles, stroke defined text, and cross hatching or textures rather than filled 
areas. Line widths are usually restricted to ones so narrow that joint and end conditions are 
not an issue. The most common improvement desired for CAD drawings is to replace the 
labels with typeset quality text. 


The three dimensional renderings from CAD systems are treated as images by page 
description languages. As mentioned above, the smooth, uncluttered images produced by a 
CAD system are very susceptible to contouring artifacts unless added noise or some other 
image processing technique is used to mask this problem. 


7. Geometric Illustration Systems 


Geometric illustration systems are those which use a geometric model in an illustration 
tool. The user constructs outlines with control points and menus or keyboard commands. 
The outlines can represent lines of different widths or can bound areas which are filled. The 
imaging model is 2 1/2 D, that is, flat with overlap. Geometric illustration systems are similar 
to 2D CAD systems in their approach to representing images but the goal of the system is to 
produce a pleasing illustration rather than building an accurate model. The intended user is a 
graphic designer who is possibly computer naive. However, even if the user interface is well 
presented, significant experience with the system is usually required to produce effective 
illustrations. 


Computer Graphics Techniques 


Geometric illustration systems share many techniques with CAD systems. However, 
special rendering and user interface problems arise because the goal of the designer is to 
control the visual aspects of the image. It is important to provide the most accurate rendering 
possible of lines, shapes, colors, and textures. The user interface must provide accurate 
control but should not interfere with the design process. The hidden structure produced by 
the construction methods should harmonize with the visual structure. 


Shapes in a geometric illustration system can be bounded by lines and curves of various 
forms —arcs, conics, and cubics— which can be filled or outlined. An efficient and accurate 
scan conversion routine that renders all these shapes is a basic requirement for these systems. 
Wide line shapes provide a particularly challenging problem. While it is easy to compute 
lines of arbitrary width when they are straight or circular arcs, the algorithms for higher order 
curves such as conics and parametric cubics are much more difficult. Additional care must be 
taken to correctly model the joints and end conditions on lines wide enough for these aspects 
to be visible. Producing an outline representation of the curve bounding a wide curved line is 
the topic of a recent PhD thesis [6]. 


The display should always reflect the current state of the design. For graphic design 
systems this can present a significant performance problem because of the complexity of the 
shapes to be rendered. A useful technique to minimize this problem is to localize the refresh, 
or repaint only the part of the display that has been affected by user action. The design of an 
efficient and effective localized refresh algorithm provides an interesting set of problems for 
the system designer [1]. 


If localized refresh is used, care must be taken that the scan conversion algorithms for 
shapes have well defined edge conditions to avoid artifacts along seams. Errors can also occur 
when bitmap operations are used to repaint part of the image. When a page is scrolled, for 
example, the visible part is moved with a bitmap operation and the remaining part of the 
image is drawn from the original representation, so the two must match exactly at the seam. 
Careful design of scan conversion and clipping algorithms will avoid these problems. 
Another type of solution is to provide a special representation for each object; one that is 
already bound to the screen resolution. All display updates are driven from this 
representation, which is kept consistent by avoiding incremental changes. 


Scan conversion precision at the pixel level is even more critical if the system uses an 
XOR (exclusive OR) function to erase and paint parts of the scene. Pixels that are 
inadvertently written more than once may end up in the wrong state at the end of the refresh. 
This phenomenon will leave little speckles across the screen, whimsically called “pixel dust.” 
The use of XOR is tempting, especially for interactive display techniques such as rubber- 
banding or dragging. However, we have found the difficulties outweigh the advantages for 
most graphic arts applications. Not only are there the scan conversion problems mentioned 
above but the visual presentation of inverted textures is often very distracting. Furthermore, 
XOR does not generalize to color displays. 
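
The "pixel dust" effect is easy to reproduce. In the Python sketch below, which uses a hypothetical one-row bitmap, two line segments share an end pixel. Drawing both with XOR toggles that pixel twice and leaves a hole; erasing with a pixel list that visits each position once then leaves a stray speck.

```python
def xor_draw(image, pixels):
    """Toggle each listed pixel; a pixel listed twice returns to its old state."""
    for x, y in pixels:
        image[y][x] ^= 1


def demo():
    image = [[0] * 6]
    seg_a = [(0, 0), (1, 0), (2, 0)]
    seg_b = [(2, 0), (3, 0), (4, 0)]            # shares pixel (2, 0) with seg_a
    xor_draw(image, seg_a + seg_b)              # draw: (2, 0) is written twice
    drawn = list(image[0])
    xor_draw(image, sorted(set(seg_a + seg_b))) # erase: each pixel written once
    return drawn, list(image[0])
```

The mismatch arises only because the draw and erase passes disagree about how many times the shared pixel is written, which is exactly the kind of pixel-level precision the text calls for.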


Selective operations on shapes require the target shapes to be selected in some manner. 
The preferred approach is to position the cursor on the desired object and click a button. This 
requires a fast hit-testing algorithm that will work on all shapes in the picture. A good 
approach [1] is to provide a suitable encoding that can be quickly intersected with a point. 
This same encoding can be used to provide efficient and accurate incremental refresh. 
Selection feedback is a difficult problem in a system where reserving any color or shape to 
indicate selection will eliminate the possibility of using that technique in an image. This 
problem becomes progressively more difficult as the amount of structure in the system 
increases. For example, it is important to distinguish between selected objects, selected parts 
of objects, and selected groups of objects. 
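
The encoding used in [1] is not described here, so the following Python sketch substitutes the simplest possible one, an axis-aligned bounding box per shape, to show how a single encoding can serve both hit-testing and incremental refresh. The class and function names are hypothetical, and a real system would follow the box test with an exact test against the shape outline.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    name: str
    x0: float
    y0: float
    x1: float
    y1: float  # bounding box corners, with x0 <= x1 and y0 <= y1

    def contains(self, x, y):
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def hit_test(shapes, x, y):
    """Return the topmost shape under the point (last in the list is on top)."""
    for s in reversed(shapes):
        if s.contains(x, y):
            return s
    return None


def damaged(shapes, box):
    """Shapes overlapping a damage rectangle, in back-to-front paint order."""
    bx0, by0, bx1, by1 = box
    return [s for s in shapes
            if s.x0 <= bx1 and bx0 <= s.x1 and s.y0 <= by1 and by0 <= s.y1]
```

After a user action, only the shapes returned by damaged need to be repainted, clipped to the damage rectangle.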


Figure 3. (a) Hole A overlaps shape B. (b) Result of conventional scan conversion. (c) 
Alternative interpretation of the user's intention. 


The object definition in the page description language combines lines, arcs, conics, and 
cubics to define the outline of shapes and the holes in shapes. Construction of such complex 
shapes requires a sophisticated user interface for manipulating lines combined with free-form 
curves. Additional complexity is added by the use of holes in shapes. For example, if a 
closed curve is defined as the boundary of a hole, how does it behave when it overlaps the 
edge of its enclosing shape? Figure 3 shows two possible interpretations. 


When shapes are overlaid they visually combine. This combination may suggest a 
different structure than the one used to construct the shape. Figure 4 shows three possible 
structures for a simple shape. This structure becomes important for rendering when outlines 
are added, and is always of interest when the user wants to modify the shape. Ideally, the user 
would be allowed to easily redefine the structure as desired. 


Hardware/System Requirements 


A geometric graphic design system must have a raster display and a pointing device. 
While such systems are available on personal computers, the complexity of the basic shapes 
and finished illustrations will be limited by the power of the processor. Many of the 
operations on curves require floating point arithmetic support. Since it is impossible to 
precisely duplicate the appearance of a design intended for paper on a display, ready access to 
a laser printer or similar printing device is desirable. 


Figure 4. Three possible constructions, (a) through (c), for a shape. 


Graphic Arts Quality Issues 


Since the model of an image used in a geometric design system is similar to that used in 
page description languages, the graphic arts quality issues of such a system are generally 
addressed. The ideal geometric design system would use precisely the semantics of the page 
description language for the ultimate output device. Such complete control, however, is only 
available in a programmable system. An interactive system will typically have some stylistic 
limitations which may need to be overcome when the illustration is added to a document. 
The use of graphical style would minimize the difficulty of this integration. 


8. Business Graphics 


While the majority of business graphics systems are simple chart producing programs, we 
include here a description of more sophisticated data-driven graphics applications. The user 
selects a template from a set of available formats, often called a chartbook. Data is entered 
either by hand or directly from another application. The final illustration is generated 
automatically. Facilities may be available to modify the finished image with either a painting 
system or a geometric editing system. While the emphasis in these systems is to allow a user 
with little drawing skill to quickly produce graphical representations of their data, 
some training in design aesthetics is necessary before effective illustrations are produced. 


Computer Graphics Techniques 


The most frequently used representations in business graphics are bar and pie charts, so a 
business graphics system must be able to render rectangles, circles, and circular arcs. The bars 
and circles may be filled with a distinctive color or texture. The regions must be labeled, often 
with an associated arrow. For plotting scientific data, it is useful to add dashed lines in a 
variety of styles. The precise rendering of boxes and circles is not difficult, but as many 
personal computer chart programs have shown, it is possible to do it poorly. Bars should 
cleanly meet their axis. Circular segments in a pie chart should smoothly align to form a 
circle. High quality text, symbols, and arrowheads do present problems in rendering because 
typical display devices have inadequate resolution for traditional designs. Careful design of 
fonts and symbols to accommodate the low resolution devices typically used in this 
application would be the most powerful solution here. 
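
One simple precaution for making pie segments align is to derive every boundary angle from the same running total, so that each segment starts exactly where the previous one ended and the last one closes the circle. The Python sketch below illustrates the idea; the function name and the use of degrees are assumptions.

```python
def pie_angles(values):
    """Return (start, end) angles in degrees for each value, in order."""
    total = sum(values)
    angles, running = [], 0
    for i, v in enumerate(values):
        start = 360.0 * running / total
        running += v
        # Force the final boundary to close the circle exactly.
        end = 360.0 if i == len(values) - 1 else 360.0 * running / total
        angles.append((start, end))
    return angles
```

Computing each angle independently from its own percentage, by contrast, lets rounding errors accumulate into visible gaps or overlaps between segments.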


The principal algorithmic issues have to do with formatting an image from data and a 
template. A bar chart has a variable number of bars, the number and height of which are 
defined by the data. Given a template for style, rendering the chart is generally 
straightforward. Difficulties arise when the data will not fit correctly in the template, or if the 
style is inadequately defined. For example, what should the system do if a label is too long to 
fit in the available space? Many of these problems must be resolved by interaction with the 
user, which is an interesting user interface problem. 
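
The following Python sketch shows the flavor of formatting a bar chart from data and a template, including detection of labels that will not fit. The template parameters (chart size, gap, and a fixed character width) and the function name are hypothetical, and the data is assumed to be positive.

```python
def layout_bars(values, labels, width=200, height=100, gap=4, char_w=6):
    """Return bar rectangles (x, y, w, h) and the labels too wide for a bar."""
    n = len(values)
    bar_w = (width - gap * (n + 1)) / n   # equal bars separated by equal gaps
    scale = height / max(values)          # tallest bar fills the chart height
    bars, too_long = [], []
    for i, (v, label) in enumerate(zip(values, labels)):
        x = gap + i * (bar_w + gap)
        bars.append((x, 0.0, bar_w, v * scale))
        if len(label) * char_w > bar_w:
            too_long.append(label)        # left for the user or a style rule
    return bars, too_long
```

The layout itself is mechanical; the list of oversized labels is where the system must fall back on a style rule (abbreviate, wrap, rotate) or ask the user.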


Included elsewhere in these notes is a discussion by Bill Bowman on the topic of 
idiomatic illustration, or the process of defining visual idioms that can be modified to give a 
specific illustration. It describes not only the simple charts and graphs common to 
commercially available systems, but also looks to the future to include diagrams, maps, plans, 
and pictorials. 


Hardware/System Requirements 


Simple charting programs can be implemented on almost any computer system that has 
access to a printer. It is useful, but not strictly necessary, to have a graphics display to preview 
the chart before it is printed. It is very important to have the program accept data from other 
business applications. The value of these programs is in their ability to produce graphical 
visualizations from real application data. Most business systems are small computers with low 
resolution dot matrix printers or pen plotters. As computer systems in businesses become 
more powerful, the output of these systems should become more sophisticated. 


It is interesting to note that there is a trend in business presentations towards projecting 
the monitor image rather than making printed slides. This eliminates the problem of 
transferring the image from the monitor to paper and allows the designer to include 
animation as well as static images in the presentation. 


Graphic Arts Quality Issues 


A recent issue of PC World reviewed 36 chart making programs, and that was only the 
subset they chose to review. Given a personal computer and a printer, it seems, anyone can 
write a charting program. Visually speaking, however, the output of most of these programs 
was terrible. It is possible to make simple graphics, even at low resolutions, aesthetically 
pleasing by choosing appropriate fonts, textures, and colors. And, as the sophistication of the 
output device is increased, the appearance of the output should improve. Some programs do 
keep a geometric rather than a raster representation of the design so that it is possible to 
produce presentation quality slides from the output. 


Automatically generated images will tend to have an inflexible style which may be an 
issue when including them with other images. To achieve the best quality it will generally be 
desirable to move the automatically produced illustration into a general purpose editing 
system to finish it. 


9. Full Color Prepress 


Digital prepress systems allow the user to digitally paste-up full color images with typeset 
text. The output of these systems is halftoned color separations suitable for offset printing. 
These systems have very little structure in the representation. Instead, they operate on very 
high resolution raster data. Much of the emphasis is on replacing the traditional darkroom 
techniques with digital ones. These include color balance, cropping, scaling, and matting. 
The user operates on full color scanned images. Regions of images can be selected, either by 
outlining or by color discrimination. The colors can be adjusted, the edges blurred, and 
regions can be copied and blended into the existing image. Some of these techniques overlap 
those of full color painting systems. Others, such as color correction, are specific to the 
prepress industry. Commercial systems are complex and require many weeks of training 
before a user can operate them effectively. 


Computer Graphics Techniques 


The basic page make-up task involves selecting sections of scanned images and 
positioning them on a page. Text is typically typeset separately, then treated like the images 
for paste-up. That is, the operator can cut out blocks of text and position them, but not edit 
them. The minimum set of operations is rectangular cropping and simple transformations. 
Scaling and rotation are also useful and enable the user to produce more complex layouts. 
Page make-up is often performed on low resolution versions of the final images to improve 
interaction time. The result of the page make-up task is a set of commands which can be 
repeated on the original data as a batch program. 


Each image in the page may require modifications. The most basic is tone adjustment, 
or redistributing the brightness values to match those of the target output device. This 
operation improves overall brightness and contrast. Portions of the image may be “cut out” 
for further modification. The region of interest is indicated either by tracing along the 
boundary with the pointing device, or by indicating the background color to an algorithm that 
automatically generates the border between this color and the desired image. Such algorithms 
should have an option to operate only on the selected hue, ignoring the changes in lightness, 
to facilitate masking images on a shaded background. 
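
Tone adjustment is commonly implemented as a lookup table applied to every pixel. The Python sketch below builds a table for a simple linear stretch between two input levels; the function names, the 8-bit range, and the linear curve are assumptions, and production systems typically let the operator shape the curve.

```python
def tone_table(in_lo, in_hi, out_lo=0, out_hi=255):
    """Build a 256-entry table mapping [in_lo, in_hi] onto [out_lo, out_hi]."""
    table = []
    for v in range(256):
        t = (v - in_lo) / (in_hi - in_lo)   # assumes in_hi > in_lo
        t = min(1.0, max(0.0, t))           # clamp values outside the range
        table.append(round(out_lo + t * (out_hi - out_lo)))
    return table


def apply_tone(pixels, table):
    """Map each 8-bit brightness value through the table."""
    return [table[v] for v in pixels]
```

Because the table is computed once, the per-pixel cost is a single lookup regardless of how complicated the tone curve is.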


Two images may be combined by overlaying or abutting them. This will require some 
technique for blending the two images so no artificial edge is visible. One general solution to 
this problem has been published [4]. Another approach is to define soft or translucent edges 
on an image while it is being cut out of the original image. Experimental techniques have 
been developed to compute the alpha channel (transparency) component of the edge from the 
color definition and user hints. This information can be used to smoothly blend the edge 
into a new background [12]. 
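
Once an alpha value is available for each edge pixel, the blend itself is a weighted average of the cut-out color and the new background. The Python sketch below shows only this final step; the function name is hypothetical, and estimating alpha from the color definition and user hints is the hard part and is not attempted here.

```python
def blend(fg, bg, alpha):
    """Per-channel linear blend: alpha=1 gives fg, alpha=0 gives bg."""
    return tuple(round(alpha * f + (1 - alpha) * b) for f, b in zip(fg, bg))
```

A soft edge is then a run of pixels whose alpha falls gradually from 1 inside the cut-out to 0 outside it.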


A major component of digital prepress systems is color correction. The colors in an 
image will be modified when scanned and again when printed. Manipulating these colors to 
get the desired result on the printed page currently requires significant experience and 
expertise. Traditional techniques involve manipulating the three or four color separations 
independently. Recently, some systems have been developed to allow the operator to 
manipulate colors on a color monitor, treating it as a proof copy of the printed page. These 
systems provide a quick way of visualizing color relationships and minimize the amount of 
adjustment performed at the separation stage of the production process. Effectively mapping 
colors from device to device is still an inadequately understood problem. Commercial 
systems that address this problem are carefully controlled and specialized to match their 
specific configuration. The details of these systems are often kept proprietary. 


Hardware/System Requirements 


Graphic arts quality images may be as large as 4000 by 4000 by 24 bits, and several may be 
combined to form a page. For production systems, therefore, the most significant 
requirement is adequate storage space. System performance will be largely a function of the 
amount of local memory and the disk swapping speed. The usual source of input for these 
systems is a graphic arts quality scanner. These scanners operate at resolutions up to 2000 
lines per inch and can scan either reflective prints or transparencies. 
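
The storage figure follows directly from the image dimensions. As a quick check in Python (the function name is hypothetical):

```python
def image_bytes(width, height, bits_per_pixel):
    """Uncompressed size of a raster image in bytes."""
    return width * height * bits_per_pixel // 8


full_page_image = image_bytes(4000, 4000, 24)  # 48,000,000 bytes
```

A single 4000 by 4000 image at 24 bits per pixel therefore occupies 48 million bytes before any compression, which corresponds to a 2 inch square scanned at 2000 lines per inch.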


If the system is used for interactive page make-up, a high resolution, good quality color 
display plus a pointing device is required. The display may be specially constructed and 
mounted to provide maximum control over the color quality, especially if the system is 
intended to simulate the color of the printed page. The principal output device for these 
systems is an output scanner that produces halftoned films. These plotters scan at resolutions 
between 1500 and 2000 spots/inch and typically contain the algorithms necessary to convert 
the intensity values in the image to halftone patterns. 


Graphic Arts Quality Issues 


Prepress systems maintain the highest possible quality while controlling the manipulation 
and reproduction of images. The principal limitation is the lack of structure in images that 
could be structured. All art, whether line or continuous tone, is scanned and manipulated 
uniformly. Text and line art may be kept at higher resolutions than the continuous tone 
images, but they are still rasters. Some commercial systems are beginning to include simple 
structured graphics for rules and borders. Future systems may well combine general purpose 
structured graphics with scanned images. 


10. Conclusion 


In summary, the most significant factor in analyzing illustration systems is the underlying 
representation presented by the system. CAD systems and geometric illustration systems 
work on a geometric model of the image. Painting systems and digital prepress systems use a 
raster representation of the image. Business graphic systems may use either, but are more 
likely geometric in the design of the initial diagram. 


Systems with similar representations tend to share user interface concepts and will tend 
to merge in the future. For example, prepress systems are importing many of the techniques 
common to painting systems. Painting systems are using low cost video cameras to add 
scanned images to painted illustrations. CAD systems are being implemented on low cost 
personal computers and promoted as technical illustration systems. Research in geometric 
design systems is exploring adopting precision placement techniques common to CAD 
systems [3] for illustration design. There is a trend in business graphics systems to allow the 
user to customize the illustration using a painting system. This operation destroys whatever 
structure was present in the illustration. Hopefully, as the market develops a demand for 
higher quality business graphics, this trend will be redirected towards merging business 
graphics with geometric design systems. 


Raster oriented systems are more flexible and usually are more attractive to users, 
whereas geometric systems produce higher quality output and have potentially more power. 
In the long term, it should be possible to combine painting techniques with structured 
techniques in a single system to produce a significantly enhanced set of functions for 
producing illustrations. 


Idiomatic illustrators 


William Bowman and Robert Flegal 
User Sciences Group 
Xerox Palo Alto Research Center 


November 1975 


Introduction 


This report describes an effort to design and implement a set of computer-based graphic 
tools that enable people, unskilled in either Graphic Arts or Computer Science, to easily 
illustrate technical ideas and information. The basic notion explored was: is it possible to 
break down the world of technical graphics into ‘idioms’ (constrained environments) such that 
the computer could provide both mechanical and aesthetic aid to the non-professional user? 


In order to test this concept, we divided technical graphics into four basic environments: 
1. quantitative 
2. ideographic 
3. isomorphic 
4. volumetric. 


Each of these basic environments was then further subdivided into graphic ‘idioms’. For 
example, piecharts and barcharts are quantitative idioms while exploded views and cutaways 
are examples of volumetric idioms. 


From the wide spectrum of possible idioms we chose to examine three of them: a 
typographic idiom, block diagrams, and piecharts. This report is primarily devoted to a 
description of the ‘idiomatic’ approach to computer graphics as we experienced it within the 
context of working with these three idioms. 


1. Idiomatic Illustrators 


Objectives 


The aim of this project was to provide Alto-based graphics tools that would enable 
people unskilled in either computer science or the graphic arts to easily construct articulate 
graphic statements. This was a six-month project, begun in February 1975 and concluded in 
August 1975. 


Method of Approach 


We conceived a research plan for creating a series of special-purpose subsystems, called 
illustrators, to deal with graphic problems on a specific rather than a general level. The design 
of these special-purpose illustrators was driven by an attempt to conform to conventional 
notions about graphic ‘idioms’ which are commonly understood and used in the working 
world. To establish a comprehensive frame of reference for this approach we reviewed a wide 
variety of illustrations, and constructed a graphic mural (reproduced on the following page) 
which represented four basic graphic environments: 


1. quantitative 
2. ideographic 
3. isomorphic 
4. volumetric. 


Quantitative figures dealt with visual translations of numerical data. Ideographic figures 
symbolized conceptual information. Isomorphic figures communicated through abbreviated 
versions of real forms. Volumetric figures represented objects as they appear or might appear. 
In each environment we subdivided illustration types in terms of communicative aim, and 
displayed various particular occasions of each aim. Each of these specific aims, along with its 
associated occasions, we called an ‘idiom’; and it is on this basis that we built the idiomatic 
illustrator project. 


The reason for choosing the idiomatic approach is that one does not need to have the 
whole world of graphic language at one’s command to create a bar chart (a graphic ‘idiom’): 
all one needs is some bars, a scale, and some labels. By the same token, if one would rather 
make a pie chart (another graphic ‘idiom’) one doesn’t need bars and scales: one needs a circle 
and some dividing lines. Applying this approach, the barchart program would only draw bars 
and the piechart program would only draw pies. Too constrained? Not for the unskilled user 
who simply wants a bar chart now without having to master the illustrator’s bag of tricks, both 
technical and aesthetic. For the unskilled user, constraint means support: the user enters the 
graphic world at an idiomatic level, and so can deal with his/her ideas using the specific 
secondary forms which represent them (scales, bars, pies, etc.) rather than the more general 
primary form vocabulary (line, shape, texture, etc.) of the professional illustrator. 


Scope 


From the range of possible idiomatic illustrators we chose to work with three: 


SIGN —A typographic program for simulating letraset type in making headlines, poster- 
notices, view-graphics, etc. (This idiom was not represented in the graphic mural.) 


BLOCK—An illustrator program for making block diagrams, organization charts, 
process charts, etc. 


PIE— A program for visualizing tabular data automatically in the form of a pie chart. 


SIGN was chosen because of its simplicity and because it was needed by the PARC video 
communications group to make titles for their videotapes. This meant a set of real users with 
whom we could try out our ideas. BLOCK was chosen because of its potential value to PARC 
as a communication tool, and because it offered us an opportunity to deal with the basic 
graphical problem of form and space interaction. PIE was selected so that we could get some 
experience with an automatic table-driven illustrator. 


All of these programs were written in SMALLTALK, with much help from people in 
LRG. The following three sections of this report describe in detail the basic features of 
SIGN, BLOCK, and PIE. The last section presents research conclusions drawn from this 
project. 


2. The Sign Program 


SIGN is a modest typographic program originally designed to produce hard copy text 
titles for use in PARC’s videotape projects, but it is equally useful for creating bulletin-board 
notices, small posters, identification labels, view-graphs, and other kinds of ‘social-style’ office 
communications. SIGN distinguishes itself from other text systems in that it is 
environmental: that is to say, it can be used to create word ‘pictures’ that catch the eye in the 
physical world of competing visual objects, such as the PARC office scene. 


The basic design criteria for SIGN were: 


1. A minimum 24 point font size, bold, and sans serif to ensure readability in the video 
medium. A 24 point Helvetica bold face was chosen. 


2. Exact compositional control on the Alto screen and identical hard copy by SLOT — 
so that what you see is what you get. 


3. A simple operating procedure that enables people not skilled in computer science or 
the graphic arts to create professional headline text à la Letraset. 


SIGN is also a step toward solving the graphical problems associated with text headings. 
Currently, it lacks a coherent scheme for dealing with margin justification, color, changeable 
leading, inter-character spacing, etc. Much interesting design remains to be done in this area. 


Two details about SIGN deserve mention: the spatial gridding and the ease with which a 
user can obtain hard copy. Vertical gridding is always enforced between lines. There is a grid 
of % of the inter-line spacing in the horizontal direction when a line of text is first specified. 
This aids centering along a vertical guideline. After the initial placement of a line of text, the 
horizontal gridding is relaxed. This allows for subsequent margin justification. The output is 
obtained through the use of command files (lots of crocks) which eventually send a press- 
format file of the screen image to LPT. The important point about output is that the program 
owes much of its popularity to the ease with which one can obtain it... with the ‘push of a 
button’. 


The command language for SIGN is menu-driven and ‘modeless’. The menu itself looks 
like a Sign so as to relate aesthetically with the text on the screen. The SIGN program is the 
most complete of the illustrators discussed in this report; the design criteria were met, and the 
program has found much use. SIGN is a particularly interesting idiom in that it lies 
graphically between ‘run-on’ text and illustration. Simply stated, SIGN deals with text 
pictorially. This is a very common procedure in the graphic design world in the production of 
headline materials for magazines, books, brochures, etc. 


The following examples illustrate some possible uses for SIGN. 


[Figures: three sample SIGN layouts: a reminder notice for a videotape showing in the 
second floor commons room, an LRG student schedule with DAY, TIME, and NUMBER 
columns, and a seminar announcement for the CSL commons room.] 


SIGN: Summary Evaluation 


Videotape title applications are very successful, and the program is now used 
regularly for that purpose by PARC’s video communication group. 


Totally inexperienced users in the video group were able to operate the program 
immediately, as were secretaries, researchers, and others in the PARC community. 


The volume of general (non-video) office applications has been much larger than we 
expected, and has proved the program to be a useful multi-purpose workhorse. A 
dribble-file associated with the program has recorded this volume of use. 


SIGN’s single-font (caps only) capability is far too limited for most practical 
applications. Currently, we have no easy answer for this deficiency. 


The program is essentially an elegant hack, and consequently some users have 
experienced frustrating breakdowns. 


The move function is still crude, and offers inadequate support for the variety of 
alignment and spacing situations which commonly occur in graphic design. 


Conceptually, SIGN offers considerable promise as a headlining device for graphic 
design work, particularly in the areas of magazine, book, and brochure production. 
The main reason for this is that it treats word forms as graphical objects, and 
consequently relates to the graphic designer's methodology. 


3. The Block Program 


BLOCK is designed to deal with graphic problems in the idiom of block diagrams, 
including organization charts, process charts, and other rectilinear figures. This program is a 
specialist. It does not attempt to take on the whole world of graphic needs, although a few 
interesting by-products such as 3-D perspective are possible. BLOCK takes a view of graphic 
language that emphasizes design grammar (spatial dynamics, composition, etc.) rather than 
form vocabulary (gray scale, sophisticated detail, etc.). It is intended to help the ordinary 
(non-illustrator) user construct an articulate graphic figure without having to learn the 
illustrator’s profession. Basic aesthetics as well as manual skills are supplied by the program. 


The design criteria for BLOCK were: 


1. A basic form vocabulary of lines and rectangles for building the structural elements 
necessary to block diagrams. Secondary requirements included word and arrow 
forms. 


2. A spatial grammar for composing form elements on the Alto picture plane with 
respect to aesthetics of planar design (visual relationship and differentiation). 


3. A capability for visual editing, including move and copy functions. Later, an area 
move/copy function was added to the criteria. 


4. A set of graphic processing utilities, including such functions as clean (refresh), file 
(save and get), print (XGP), and reset. 


BLOCK, like SIGN, has been developed to the point that it is a usable SMALLTALK 
subsystem for making illustrations. The essence of the BLOCK program lies in its gridding 
scheme which spatially organizes its graphical forms (box, line, arrow, text) in an aesthetically 
related manner. During the design of BLOCK it became clear that no existing font was 
suitable for diagrammatic purposes, so we designed and executed a new font. The design 
criteria for the font (BLOCKFONT) were: 





1. That it be a condensed font to maximize horizontal space on the Alto screen, which 
is a major constraint in making diagrams. 


2. That it have the smallest bold (2-bit thick) face possible on the Alto screen, and still 
remain readable. 


3. That the font relate aesthetically to the rectilinear forms generated with the BLOCK 
program. 


First, an Alto font satisfying these criteria was designed. Its dimensions are 6 x 10. 
Subsequently, a coordinated spline outline version was constructed. This font should find 
wide usage in PARC terminal displays where horizontal space is at a premium. 


The command language for BLOCK is menu-driven and ‘modeless’. The thirteen 
commands are divided into four logical groups: 

1. form vocabulary (box, line, arrow, text) 
2. space control (grid module) 
3. editing functions (area, move, copy) 
4. memory commands (print, save, get, reset, clean) 

The menu itself is presented in the form of a block diagram for (1) aesthetic relevance 
and (2) to enable the visual presence of a number of command options without creating a 
sense of visual confusion. 


Individual command functions for BLOCK are as follows: 


BOX: Draws boxes, any size or shape. The command requires two mouse inputs: upper 
left and lower right box coordinates. The box corners are positioned at the nearest 
points on a 32-unit grid. This aligns boxes automatically, provides consistent 
spacing, and allows the user to be rough in his/her manual command executions. 


LINE: Draws lines, any length or direction. The command requires two mouse inputs: 
beginning and ending points of the line. The line endpoints are positioned at the 
nearest points on a 16-unit grid. Because of the grid, lines will automatically split 
spaces between boxes, and provide centering and exact box contact when used as 
connecting links. In addition, lines built at right angles to each other automatically 
form a perfect corner. Again, the user may be somewhat rough in manual execution 
without problem. 


ARROW: Draws lines with arrowheads attached to the point designated by the second 
mouse input. Arrow lines may be any length, vertically or horizontally. In all other 
respects this command functions like line. 


TEXT: Prints a line of text as an object anywhere in the figure, on an 8-unit grid. The text 
automatically centers itself within boxes. Inputs are typed sequences (terminated 
with carriage return); and mouse points (center location for text). 


MOVE: Moves any of the above objects anywhere in the image, in terms of its assigned 
grid. Move can also be used to dump unwanted objects into the garbage can at the 
bottom right of the screen, causing them to disappear. Two mouse inputs are 
required, corresponding to old and new locations. 


COPY: Copies objects anywhere in the image, in terms of assigned gridding. Like 
move, two mouse inputs are required, to indicate form selected and the desired 
position of its copy. 


AREA: Selects a form rather than an object, for moving or copying. As in box, two 
mouse inputs are required to indicate upper left and lower right corners of the 
rectangular area selected. In addition, third and fourth mouse inputs are required 
corresponding to old and new locations for the forms included within the 
rectangular area selected. The rectangular area selected will then be moved or 
copied in the new location. 


GRID: Permits the user to change the assigned grid spacing for any particular form. 


PRINT: Creates an XGP file for hard copy. Input is a filename (one word terminated 
with line-feed). The file created may then be transmitted to a NOVA with an XGP, 
and the command ‘XPLOT filename’ given to the NOVA operating system. 


CLEAN: Refreshes the entire image, restoring forms damaged by moving, etc. 


SAVE: Allows images to be saved for future display, printing or modification. 


GET: Allows previously saved images to be recalled. 


RESET: Erases entire screen and restarts the BLOCK program. 
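

The gridding described for BOX, LINE, and TEXT amounts to rounding each mouse point to the nearest multiple of the grid spacing assigned to the form being drawn. The following Python sketch illustrates that idea only. The name `snap` and the `GRID_UNITS` table are hypothetical and do not come from the original SMALLTALK program; the spacings of 32, 16, and 8 units are the ones quoted in the command descriptions.

```python
# Default grid spacing per form type, as quoted in the command descriptions.
# The table and function names are illustrative, not from the original program.
GRID_UNITS = {"box": 32, "line": 16, "text": 8}


def snap(point, form):
    """Move a roughly placed mouse point to the nearest grid point for a form."""
    unit = GRID_UNITS[form]
    x, y = point
    # Add half a unit, then floor to a multiple of the unit (ties round up).
    return ((x + unit // 2) // unit * unit,
            (y + unit // 2) // unit * unit)


rough = (45, 70)
print(snap(rough, "box"))   # (32, 64)
print(snap(rough, "line"))  # (48, 64)
print(snap(rough, "text"))  # (48, 72)
```

Because the 16-unit line grid is half the 32-unit box grid, a snapped line endpoint can land midway between two box positions, which is consistent with the remark that lines split the spaces between boxes.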


In addition to menu commands, line weight for any form may be controlled by the 
mouse button pushed: 

1. top button: fine line 
2. middle button: medium line 
3. bottom button: heavy line. 


We have included a group of illustrations which describe BLOCK’s range of capabilities, 
and suggest how the program might be used. 


BLOCK: Summary Evaluation 


The most successful aspect of this program is its spatial control of form. The notion 
of ‘invisible’ gridding as a strategy for the management of form/space interaction 
(design grammar) worked well, and has since been used with equal success by other 
programs at PARC (e.g., MARKUP). 


The simplicity of BLOCK has enabled many (graphically) inexperienced users to 
construct effective block diagrams. However, it is also clear from the work done that 
BLOCK does not ‘do it all’ as we had hoped, and that some elementary graphic 
skills are still required. 


The BLOCKFONT worked well as a conserver of horizontal space, and competes 
well in the context of diagrammatic form. 


Area move and copy functions are still difficult to control, and require too much 
visual editing. The displacement for all objects within the area is gridded according 
to the current grid setting for text objects (usually the smallest). 


The concept of a fixed push-button graphic menu was, as in TAPE, felt to be an 
improvement over keyboard-oriented command systems. By the same token, it now 
appears that MARKUP’s spatially-flexible menu system and TOOLBOX’s keyset 
control system are much easier to operate than BLOCK’s fixed menu. 


BLOCK lets the user know where his/her cursor is in relation to the ‘invisible’ grid 
spacing by moving the cursor to the nearest grid point (according to the form being 
created) when the mouse button is depressed. As long as the button remains 
depressed the cursor “hops” from grid point to grid point when the mouse is moved, 
and a point is specified when the mouse button is released. 


It is a demonstrable fact that infinite variations on the ‘block diagram’ theme can be 
created with relative ease using this program. However, exactly where BLOCK ends 
and FLOW, or PERT, etc., begin is not yet clear. Further exploration with other 
related idioms would help to answer this question. 


Some fairly sophisticated illustrations can be created through line vocabulary alone. 
Mouse buttons work well as a tactical means for controlling line weight. 


In an illustrator context, it helps to be able to deal with words as graphic form 
objects (like lines or boxes). 


4. The PIE Program 


PIE is an experimental effort to create an ‘automatic illustrator’; that is, a program that 
puts the ‘illustrator’ entirely within the machine and thus allows the user to get a professional- 
level illustration without having to perform any graphical tasks. The graphic idiom of pie 
charts was chosen for this experiment because, as a data-based idiom, it lends itself naturally 
to mechanical graphic translation. The basic graphic design decisions in making a pie chart 
are quantitative: not only the spatial division of the pie into its component segments, but also 
the placement of labels in relation to the available space resulting from those segments. 
Therefore, all that PIE requires of the user is a table of items (labels) and their associated 
numerical values (segments). The program (1) makes the pie, (2) translates the numbers into 
percent values and cuts the pie into corresponding pieces, and (3) attaches item labels to the 
segments. User interaction takes place entirely within the context of creating and/or editing 
the tabular data, a familiar and ordinary office activity. 


Basic design criteria for PIE were: 

1. A form vocabulary comprised of a single fixed-diameter circle, straight radial lines 
within that circle, and text labels. 

2. A spatial grammar that translates a set of numbers into degree equivalents, and 
represents those equivalents as pie segments using radial lines. 

3. Automatic/aesthetic label placement, with respect to shares and positions of pie 
segments. 

4. A system for tabular data entry that permits interactive user editing. 


In satisfying the design criteria for an automatic piechart-maker the most difficult 
problem was that of label placement. The strategy for this part of the program was as follows: 
If a pie segment had adequate size and/or an advantageous position for (horizontal) text, then 
(1) the label would be placed internally and generally centered within the available space. If 
the segment was small and/or vertically oriented, then (2) the label would be placed 
externally, and related to its segment by a connecting link. 


The space available for a text label within a slice of the pie was computed as follows: 


(a) point $p$ is chosen so that it lies on the bisector of angle $(\theta + \phi)/2$ and is located 
$\tfrac{3}{5}R$ from the center of the circle. 

(b) the four points $s_1$, $s_2$, $s_3$, and $s_4$ are computed by finding the intersections of the 
line $y = x(p_y + \tfrac{1}{2}h)/p_x$ with lines $l_1$, $l_2$ and the circle. (Note: $h$ = font height) 

(c) next the four points $t_1$, $t_2$, $t_3$, and $t_4$ are computed by finding the four intersections of 
the line $y = x(p_y - \tfrac{1}{2}h)/p_x$ with lines $l_1$, $l_2$ and the circle. 

(d) finally, if $s_i, s_j \in \{s_1, s_2, s_3, s_4\}$ are the intersections that lie immediately to the left 
and right of $p_x$, and $t_k, t_l \in \{t_1, t_2, t_3, t_4\}$ are defined similarly, then the rectangle with 
upper left corner at $\max(s_i, t_k)$ and lower right corner at $\min(s_j, t_l)$ is the space that 
text may occupy and still be inside the slice defined by $\theta$ and $\phi$. 


It should be noted that the space for text obtained by the method just described does not 
yield the maximum width rectangle that can lie in a segment of the pie. Originally we 
computed the maximum width rectangle that can lie in a segment. Placing the text centered 
in this rectangle caused graphical interference between the text and the radial lines which 
divide the pie and the links used where text could not fit inside the slice. Hence we chose the 
algorithm which, in general, produced a smaller space for the text but yielded a more 
aesthetically pleasing result. 
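

The label-space computation in steps (a) through (d) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original SMALLTALK code. It treats the two lines of steps (b) and (c) as horizontal lines at heights p_y + h/2 and p_y - h/2 (the top and bottom of a line of text), which is an assumption of this sketch rather than a literal transcription of the printed formula. It also assumes a circle centered at the origin, y increasing upward, bounding angles in radians measured counter-clockwise from the +x axis, and a slice narrower than a half circle. The function name `label_space` and the inside-the-slice guard are additions made for the example.

```python
import math


def label_space(a1, a2, radius, font_height):
    """Return (left, right, top, bottom) of the text rectangle, or None.

    a1 < a2 are the slice's bounding angles in radians (a2 - a1 < pi).
    """
    mid = (a1 + a2) / 2.0
    # Step (a): p lies on the bisector, 3/5 of the radius from the center.
    px = 0.6 * radius * math.cos(mid)
    py = 0.6 * radius * math.sin(mid)
    top = py + font_height / 2.0
    bottom = py - font_height / 2.0

    def inside_slice(x, y):
        # Guard added for this sketch: is (x, y) within the slice at all?
        sweep = (math.atan2(y, x) - a1) % (2.0 * math.pi)
        return math.hypot(x, y) <= radius and sweep <= a2 - a1

    def nearest_crossings(y):
        # Steps (b)/(c): x-coordinates where the horizontal line at height y
        # meets the circle and the two radial lines.
        xs = []
        if abs(y) < radius:
            half_chord = math.sqrt(radius * radius - y * y)
            xs += [-half_chord, half_chord]
        for a in (a1, a2):
            s = math.sin(a)
            if abs(s) > 1e-12:  # a horizontal radial line is skipped
                xs.append(y * math.cos(a) / s)
        left = max((x for x in xs if x < px), default=None)
        right = min((x for x in xs if x > px), default=None)
        return left, right

    if not (inside_slice(px, top) and inside_slice(px, bottom)):
        return None  # slice too thin at p for one line of text
    s_left, s_right = nearest_crossings(top)
    t_left, t_right = nearest_crossings(bottom)
    if None in (s_left, s_right, t_left, t_right):
        return None
    # Step (d): keep the tighter bound on each side.
    return max(s_left, t_left), min(s_right, t_right), top, bottom


# A quarter-circle slice pointing right, radius 100, 10-unit font height.
print(label_space(-math.pi / 4, math.pi / 4, 100.0, 10.0))
```

A caller could compare the rectangle's width with the width of the label text and fall back to external placement when the function returns `None` or the rectangle is too narrow.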


The system for external pielabels sought to maximize the number of possible labels that 
could be automatically arrayed around the pie, and at the same time make the most 
economical use of available space on the Alto screen. It appeared that a parabolic 
arrangement of labels around the pie produced the most efficient and manageable external 
pielabel system, as illustrated by the following design drawing: 


The user interface for PIE is a simple table, into which the user types item names (for 
labels) and corresponding numerical quantities (for segments). As the user types in items and 
quantities the table expands downward. This table can be edited: items and quantities can be 
added, deleted, exchanged, or moved as desired. The order in which items are displayed 
corresponds to the order in which they are represented in the pie (starting at ‘noon’ and 
advancing clockwise). Thus, the user has control over the segment arrangement in his 
piechart. 
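

The translation from table rows to pie segments described above (values to percent shares, then to angles laid out clockwise from ‘noon’ in table order) can be sketched as follows. The function name `pie_segments` and the sample rows are hypothetical and serve only to illustrate the arithmetic; they are not taken from the PIE program.

```python
def pie_segments(table):
    """Turn (label, value) rows into (label, percent, start_deg, end_deg).

    Angles are in degrees, measured clockwise from 'noon' (straight up),
    and segments follow the order of the rows in the table.
    """
    total = sum(value for _, value in table)
    if total <= 0:
        raise ValueError("values must sum to a positive total")
    segments = []
    start = 0.0
    for label, value in table:
        share = value / total
        end = start + 360.0 * share
        segments.append((label, 100.0 * share, start, end))
        start = end
    return segments


# Illustrative data only.
rows = [("Rent", 50), ("Food", 25), ("Travel", 25)]
for segment in pie_segments(rows):
    print(segment)
# ('Rent', 50.0, 0.0, 180.0)
# ('Food', 25.0, 180.0, 270.0)
# ('Travel', 25.0, 270.0, 360.0)
```

Reordering the rows changes only the start and end angles, which mirrors the control the table gives the user over segment arrangement.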





It should be emphasized that unlike SIGN and BLOCK, PIE is still in an experimental 
state, and not yet ready for dependable work applications. However, we have tested the 
program against a variety of data situations, and have essentially succeeded in satisfying the 
original design criteria established for the idiom. We can offer the following illustration, 
executed with PIE, as an example of the program’s current capability: 





PIE: Summary Evaluation 


1. For certain kinds of illustrations (particularly quantitative) automatic illustrator 
programs are quite possible. Essentially, PIE can produce a good pie chart without 
any user participation in the graphic process. Based on our experience with PIE, we 
believe that bar charts and curve graphs can also be produced in a more or less 
automatic fashion. 


2. Word and number labeling (because of its unpredictable length) is a serious 
problem for automatic illustrators, and as of now there appears to be no simple 
solution. 


3. Graphic execution time saved in PIE-like illustrators is enormous, much more than 
in SIGN or BLOCK. 

4. Creating and editing tabular data is in itself a graphically idiomatic process (quite 
aside from its application) and from our experience with PIE looks like a pregnant 
area for future research. 


5. Conclusions 


These conclusions are an attempt to summarize our research findings in relation to 
BLOCK, SIGN, and PIE. We hope these conclusions will be helpful to others involved in the 
design of interactive picture-making systems. 


On Methodology 


We did not adopt the more common approach of specifying and implementing a 
graphics system and then writing the application programs. We rather scrounged whatever 
graphics capability was available (SMALLTALK) and began by simulating the illustrator’s 
habit of building up a ‘graphics language’ as we worked. We were able to do this because 
SMALLTALK already contained a rich set of graphics primitives. 


We began our investigation with three of the simplest and most commonly used idioms. 
Our reasons for this decision were twofold: first, about a dozen simple, well-known 
conventional idioms account for the bulk of technical graphics used in the working world, and 
secondly, it allowed us to concentrate on user issues such as command languages rather than 
on system issues that arise when dealing with complex pictorial representation. This approach 
drove out two insights that we might have missed had we adopted the more conventional 
approach that involves the development of a graphics system and then the design and 
implementation of the application programs. The insights are: (1) a very simple set of 
programming tools is sufficient for the development of most graphical idioms for general 
office use and (2) the user requirements in applications where the presentation is 
2-dimensional and dynamic are much more subtle and complex than we had imagined. We 
found ourselves designing form ‘processes’ rather than form ‘products’ through which to 
create pictures. Picture creation takes place in a human time continuum and the ‘rhythm’ of 
visualization is as important as the availability of form options. 


On Resources 


As mentioned above, one does not need much in the way of a graphics system to write 
useful application programs. The SMALLTALK picture manipulation and drawing 
primitives are quite sufficient. These include and enable: 


1. rectangles, points and grids - SPATIAL GRAMMAR 

2. lines of up to seven thicknesses - FORM VOCABULARY 

3. text strings - LITERAL IDENTIFICATION 

4. turtle delineation - GRAPHIC STATEMENT 


For a complete description of this system, please refer to the SMALLTALK manual. 


On Project Results 


We feel that this project was a success in that we have demonstrated that conventional 
graphic idioms can be combined with current computer technology so that ordinary 
(graphically unskilled) people can create articulate graphical statements. This has 
been demonstrated by various utilizations of the BLOCK program within the PARC 
community involving the creation of block diagrams. The simple compositional help that is 
offered by BLOCK greatly enhanced the aesthetic character of the user diagrams. The 
piechart program offers a powerful ‘machine tool’ for the person who wants to represent 
tabular data in visual form without having to actively engage in the techniques of technical 
illustration, or in this case, decisions of label placement. Evidence of SIGN’s utility can be 
found in PARC videotapes and on many PARC bulletin boards. 


On Future Research 


We have in the scope of this project only scratched the surface of the idiomatic illustrator 
concept. There remain many modifications to explore with BLOCK, PIE, and SIGN. For 
example, can one make the stages in picture specification like block-out and touch-up more 
explicit? This would offer the non-professional user even more help during the creation of 
his/her illustration. 


There also remain a host of other idioms to explore, such as barcharts, curve graphs, 
plans, maps, volumetric representations, etc. We believe an understanding of commonly 
understood graphic communication idioms in the context of a display-based interactive 
computing system will have large payoffs in office information systems of the future. For 
such systems (idiomatic illustrators) to be really useful they need to be integrated with a 
system that includes text. Research in this area is currently underway at PARC (Master- 
maker Project). 


Projecting even further into the future, text/graphics systems should allow for the 
personalization of graphic programs so that professionals in the fields of graphic design and 
illustration can incorporate the computer as an effective medium for visual communication. 


An Idiomatic Model for Star Graphics 


Long-Range Concepts and Facilities Outline 


William Bowman 


Nov. 20, 1980 


Introduction: Idiomatic Illustrators 


An idiomatic illustrator is a special-purpose 
graphics tool that enables the construction of a 
particular class of real-world illustrations. The 
illustrators for Star are curve graph, bar chart, pie 
chart, diagram, map, plan, and perspective. While 
each illustrator is conceived of as a self-contained 
illustration machine, they will all operate within the 
general graphics environment. In particular, all 
illustrators are composed of transfer symbols, 
making them graphic constructions. 


In terms of graphic language, an ‘idiom’ is an 
expression whose form structure and meaning are 
peculiar to a specific class of images. It is a 
constrained form/space environment which operates 
by certain conventional rules to produce certain 
visually communicative effects. Thus in Star the bar 
chart illustrator is a machine for constructing 
illustrations in the graphic idiom of bar charts. 


The purpose of the idiomatic illustrator is to 
enable Star users without professional graphic skills 
to create quality illustrations. Many of the graphical 
operations in conventional illustration are rational, 
measurable, and repetitive, and can be successfully 
represented in a machine system which does most of 
the actual graphic construction. Idiomatic 
illustrators are designed to provide the user with 
more specific drawing supports than are provided by 
the general graphics tools. The office graphics user 
does not need the whole world of graphic language 
at his command to create, say, a bar chart; all he 
needs are a scale and some bars and labels. On the 
other hand, if he would rather make a pie chart, he 
doesn’t need bars and scales; he needs only a circle 
and some dividing lines and labels. Thus in Star, the 
bar chart illustrator only draws bars, and the pie 
chart illustrator only draws pies. Too constrained? 
Not for the unskilled user who simply wants a bar 
chart now without first having to master the 
professional illustrator’s bag of tricks, both technical 
and esthetic. For the unskilled user, constraint 
means support. The user enters the graphics world at 
the application level, where he can represent his 
ideas and information using recognizable illustration 
elements (bars, scales, pie slices, etc.) rather than the 
more general form elements (lines, shapes, etc.) 
which are the conventional resources of the skilled 
graphics specialist. 


While most graphic design decisions are 
reserved for the user, the actual drawing of the 
image is intended to be as automated as possible. 
This (1) enables the user to produce illustrations 
which might otherwise exceed his manual skills, (2) 
relieves the user from the burden of ‘mindless’ 
repetitive graphic operations, (3) maximizes the 
technical quality and consistency of the resulting 
illustrations, and (4) minimizes the time it takes to 
draw them. 


The three basic levels of design and execution 
in the idiomatic illustration process are called spatial 
grammar, form vocabulary, and visual editing. 


A. Spatial grammar includes the global features 
which control the overall spatial structure, size, and 
scaling of the illustration. Included are the specific 
graphic objects which represent these global 
characteristics, such as grid scales, guidelines, 
perspective planes, etc. The purpose of spatial 
grammar is to provide a compositional framework 
for the construction, placement, and in some cases 
the measurement of form elements in the 
illustration. 


B. Form vocabulary includes the intermediate-level 
graphic objects which represent the subject matter 
elements of the illustration, such as bars, curves, pie 
slices, blocks, symbols, location points, routes, areas, 
shapes, volumes, etc. Their purpose is to provide 
the user with an appropriate range of prefabricated 
form elements and property options to satisfy both 
the technical and the esthetic requirements of the 
idiomatic illustration. These elements are transfer 
symbols. Spatial placement of form elements is 
automated wherever appropriate, and is otherwise 
manual. 


C. Visual editing includes special modifications, 
additions, and/or deletions of form details on the 
lowest level of form selection. General graphics 
tools are used on this level to enable a higher degree 
of individualization in the illustration: to improve 
its appearance or esthetic impact, to enable stylistic 
details, to visually relate or differentiate parts, to 
emphasize or subordinate particular elements, to 
make technical corrections, to add supplementary 
notes or labels, and to generally polish the 
illustration to make it a more articulate visual 
statement. 


1. Curve Graph Illustrator 
User Overview 


The curve graph illustrator represents a 
progressive sequence of data points in relation to a 
scaled (grid) frame of reference, and generalizes 
those points in the form of a linear trend curve. 


Data point graphs show only the specific 
locations of recorded information, without 
visualization of apparent trends. Close data 
sampling favors this type of presentation. For 
accurate data measurement, a full scale can be used. 


Straight line-pieced graphs connect more 
spread-out data points into trend ‘curves.’ Point-to- 
point connection retains visual evidence of data 
point locations while at the same time shows their 
interrelated trend. Areas of interest between curves 
can be color-filled for emphasis. 


Spline-curve graphs create continuous trend 
curves which pass through all the points in a data 
sequence. Specific data locations are less important 
than their projected continuity. Tick scales offer a 
clearer visual field for curve display. In the 
accompanying example, a scale segment is shaded to 
identify an area of special meaning. 


Least-squares-best-fit graphs maximize the 
generalization of data point sequences into simple 
trend lines which do not necessarily pass through the 
points. A log scale is used in the accompanying 
example. 
Scenario

The curve graph illustration process is as
follows:

A. Spatial grammar

First priority in creating a curve graph is setting
the ordinate and abscissa (vertical and horizontal)
scales. They can be cartesian, logarithmic, or double
logarithmic in terms of measurement and can be
edge-based or based on an internal axis.

B. Form vocabulary

In the curve graph, coordinate data points are
automatically plotted by the system from numbers
typed into a table. These points can be interpreted
into one of several curve interpretations:
least-squares-best-fit, line-pieced, spline, etc. Ordinate
and abscissa labels, as well as curve names, are also
typed into the table and placed appropriately in the
figure by the system. A choice will be made
between representing the coordinate data as points
only, points plus curve, and curve only. One of
three point/line sizes can be chosen for representing
curves. One of five pre-designed point structures
can be chosen for representing and differentiating
sequences of coordinate data points. One of five
pre-designed curve structures can be chosen to represent
different curve lines.





C. Visual editing


After the curve graph is executed, it can be
edited for detail-level additions and/or changes.
Grids and ticks, points, curves, and labels can be
individually selected for property changes. Curve
labels, if not produced automatically, can be created
and placed. Lines can be added, such as
manually-constructed curve segments, scale-oriented index
lines, arrowheads, callout lines, etc. Other visual
edits might include outlined rectangular areas and
shaded backgrounds, additional text labels,
identifications, notes, etc.


2. Bar Chart Illustrator 
User Overview 


The bar chart illustrator represents subject 
matter data as comparative bar quantities, measured 
against an ordinate scale. 


Divided-bar charts compare subdivided 
quantities, the parts of which are represented as 
additive amounts. Visual differentiation (by color 
and/or texture) is important. 


Split or mirrored bar charts are constructed on 
a compound scale to show positive/negative values 
or to compare pairs of related subject elements, e.g., 
men and women, etc. 


Grouped-bar charts show sets of related subject 
elements in a comparative rather than additive way 
(as in divided bars). Horizontal bar orientation 
provides better space for long labels. 


Floating-bar charts show quantitative range 
rather than fixed amount. This type of chart is 
commonly used to represent phased scheduling and 
performance. 
Scenario


The bar chart illustration process is as follows: 
A. Spatial grammar 


The first step in making a bar chart is setting 
the ordinate scale. The linear scale structure can be 
varied in size and is able to grow tick and/or grid 
lines at selected major and minor subdivision 
locations. The scale can be oriented vertically or 
horizontally and can be edge-based or based on an 
internal axis. Bars can show quantities whole or 
subdivided, based or floating. In addition, bars can 
be organized singly or in groups; and if single, can 
be spaced or joined. The grouped option requires 
the user to type in the number of bars per group- 
unit. The system automatically rearranges the bars 
and spaces to show the new organization. The bar 
widths expand to occupy any new space made 
available by the closing up into groups. One of five 
options can be chosen for representing the ratio of 
bar width to its accompanying space. 


B. Form vocabulary 


Measured bar heights are automatically plotted 
by the system from numbers typed into a table. Bar 
names are also typed into the table and placed 
appropriately in the figure by the system. The table 
format itself will vary depending on the type of bar 
represented. One of five colors and/or three 
textures can be chosen to fill bar pieces. 


C. Visual editing 


After the bar chart is executed, it can be edited 
for detail-level additions and/or changes. Scales, 
bars, and labels can be individually selected for 
property changes. Special modifications might 
include: scale-oriented index lines, arrowheads, 
callout lines, individual bar shading changes (for 
‘projected’ quantities, etc.), grid line emphasis, area
outlines, shaded backgrounds, ‘interrupted’ bar
breaks, special key constructions, additional text 
labels, identifications, notes, etc. 


3. Pie Chart Illustrator 
User Overview 


The pie chart illustrator represents divided 
whole amounts in terms of percentage values of the 
whole. These component values are represented by 
slices which comprise the pie. 


Sliced segments can be differentiated with color 
shading and/or textures. Labels can be placed 
internally or externally depending on available space 
inside the slices, size of labels, and style preference. 


Individual pie slices (or groups of slices) can be 
separated from the whole pie for the purpose of 
emphasizing segment differences at the expense of 
the overall pie unity. 
Scenario


The pie chart illustration process is as follows: 
A. Spatial grammar 


The first design action in making a pie chart is 
to expand (or shrink) the figure model to the desired 
gross working size on the screen. This can be done 
by pinning the center and stretching the
circumference manually. One of five line weights
can be chosen with which to represent the pie chart. 





B. Form vocabulary 


Contained or separated slice sizes are 
automatically calculated by the system from 
numbers typed into a table. The numbers are 
translated into percentages, which are then 
converted into degree equivalents that enable 
appropriate placement of the slice lines. Labels are 
also typed into the table and can be placed 
automatically or manually. One of five alternative 
colors (white, light gray, medium gray, dark gray, 
black) can be chosen for filling individual slice areas. 
One of three alternative textures can be chosen for 
filling individual slice areas. Textures can be 
combined (overlaid) with each other and with
colors. 
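The conversion from table numbers to percentages and then to degree equivalents can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and angles are measured cumulatively from an arbitrary zero direction, since the notes do not specify a starting direction or orientation.

```python
def pie_slices(values):
    """Convert raw table values into (percent, start_deg, end_deg) triples.

    Angles accumulate from 0 to 360 degrees; the zero direction and the
    sense of rotation are left to the drawing code (an assumption here).
    """
    total = sum(values)
    if total <= 0 or any(v < 0 for v in values):
        raise ValueError("values must be non-negative with a positive sum")
    slices = []
    running = 0
    for value in values:
        percent = 100.0 * value / total
        start_deg = 360.0 * running / total
        running += value
        end_deg = 360.0 * running / total
        slices.append((percent, start_deg, end_deg))
    return slices
```

For example, table values of 1, 1, and 2 yield slices of 25, 25, and 50 percent, whose slice lines fall at 0, 90, and 180 degrees.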





C. Visual editing 


After the pie chart is executed, it can be edited 
for detail-level additions and/or changes. Pie slices 
and labels can be individually selected for property 
changes. If pie slice callout labels and lines were not 
automatically produced, they can be constructed in 
this editing phase. Other modifications might 
include: changes in quality of slice shading or 
texture, a constructed key for remote identification 
of slices, arrowheads, additional callout lines, text 
labels, identifications, notes, etc. 
4. Diagram Illustrator


User Overview 
The diagram illustrator shows organization,
flow, process, system, and functional arrangement.







Block diagrams show organization and/or flow 
of abstract subject elements in terms of mutual 
interrelationships. Elements utilize primary-form 
shapes which are text-identified. Business
organizations and industrial and computer systems 
are common subject matter for this kind of figure. 








Schematic diagrams show process and/or flow 
between conventionalized symbol elements as an 
expression of system interrelationships. Electronic 
systems are a major user of this idiom, which also 
includes industrial and architectural wiring 
diagrams. 





Associative diagrams show the formalized
identity and functional arrangement of subject
elements that comprise a system configuration. This
type of illustration emphasizes graphic abbreviations
of recognizable forms to communicate through
isomorphic cues.
Scenario


The diagram illustration process is as follows:
A. Spatial grammar 


Diagram composition is facilitated by a grid- 
coordinated spatial guideline system which provides 
a framework of efficient and esthetic placement of 
symbol elements. Symbol elements can be easily 
placed into aligned row/column positions by mouse. 
Guidelines are visible as a layout tool but are not 
visible in the finished illustration. 
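A grid-coordinated placement of this kind can be illustrated by a small snapping routine. The sketch below is an assumption about how such alignment might be done (uniform square grid, nearest-intersection snapping); the function name and parameters are hypothetical and are not taken from the system described.

```python
def snap_to_grid(x, y, spacing, origin=(0, 0)):
    """Move a pointed-at position to the nearest grid intersection.

    `spacing` is the distance between guidelines; `origin` is the
    position of one grid intersection.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    ox, oy = origin
    column = round((x - ox) / spacing)
    row = round((y - oy) / spacing)
    return ox + column * spacing, oy + row * spacing
```

With a guideline spacing of 20 units, a symbol dropped at (37, 52) would be placed at the intersection (40, 60).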





B. Form vocabulary 


Basic diagram elements are transfer symbols, in 
the form of symbol ‘templates.’ These symbol
elements can include text labels and can be 
automatically sized to fit the label size. The symbol 
can be constructed with any one of a variety of line 
weights, structures, or colors. Grid-coordinated line 
segments can be constructed between symbols along 
selected gridline paths. Link arrowheads are 
optional. Both links and arrowheads can vary in size 
and structure. 










C. Visual editing 


The diagram can be visually edited for a variety 
of detail-level modifications. Special symbol
element shapes can be constructed. Symbol- 
independent labels and notes can be added, as well 
as callouts. Shaded area backgrounds and area color 
reversals may also be used. Curve segments may 
replace straight link lines if desired. 
5. Map Illustrator
User Overview 


The map illustrator shows geography, location, 
position, area, direction, and survey. 


Topographic maps show geographical surface 
structure in terms of spatial elevation. This 
information is usually shown in the form of linear 
contour lines, or as chiaroscuro (light and dark) 
shading of physical surface characteristics. 


Area maps show geographical location in terms 
of political and physical features. Population 
distribution, economic data, climatic patterns, etc., 
as well as general geographical information are the 
content of this kind of map. 


Route maps show geographical direction in 
terms of measured linear extensions. Roads, 
networks, trade routes, and directional weather data 
are subject matter for this kind of figure. Physical 
geography is represented only as a frame of 
reference. 





Subdivision maps show geographical content in 
terms of surveyed planar dimensions. This includes 
tract maps, city/county parcel drawings, real estate 
development plans, and other measured land 
illustrations. 
Scenario


The map illustration process is as follows:
A. Spatial grammar


A geographic coordinate scale is set as the basic
frame of reference for the measured location of all
map elements. This scale can display longitude and
latitude marks and can vary in its type of form
structure, projection method, and data/label
content.


B. Form vocabulary


Subject elements are placed within the scaled 
map area. Primary categories of map elements 
include landmass boundaries (coastlines, rivers, 
islands, etc.), location points (cities, airports, etc.), 
routes (roads, weather patterns, etc.), and areas 
(states, regions, etc.). These can be organized within 
the geographic scale area by class, type, locale, etc., 
for communication purposes, and graphically 
represented by one of a variety of line weights and 
structures, colors, textures, etc. 





C. Visual editing 


Detail-level modifications are enabled by the 
general graphics functions and include special- 
purpose location forms, one-of-a-kind line or color 
properties, and unique graphic changes to transfer 
symbol elements. 
6. Plan Illustrator
User Overview 


The plan illustrator shows layout, structure, 
external shape, and correlated views. 


Floor plans show surface structure layout in 
terms of measured dimensions. This is basically an 
architectural application used to present a ground or 
floor level representation of structural forms, utility 
locations, and finishing details. 


Section plans show internal structure in terms 
of a straight cut through a given plane. This can 
apply equally to architectural, engineering, or even 
biological subject matter. Internal material 
constitution is often indicated through coded 
conventional texture patterns. 


Side plans show the external shape of a subject 
as a measured parallel view. This has a general 
application to many fields. In architecture, this kind 
of view is called an “elevation.” Parallel visual 
features are emphasized. 


Composite plans show two or more correlated
and measured views of a subject. Two- and three- 
view plans are common both in architectural and 
engineering drawings. Interactive guidelines and 
dimension lines are usually included in this kind of 
figure. 
Scenario


The plan illustration process is as follows: 
A. Spatial grammar 


A vertical/horizontal measurement scale is set 
to provide a basic frame of reference for the 
measured layout of all plan element views. This 
scale can vary in its units of measure and its method 
of display. It is visible as a temporary layout tool, 
but is not a permanent part of the illustration. 


B. Form vocabulary 


Subject elements are created/placed within the 
scaled layout area. Major plan elements include 
object shapes (as planar views, elevations, sections, 
etc.), conventional field symbols (architectural, 
engineering, etc.), and descriptive paths (system 
links, center lines, etc.). Object shapes in particular 
are scale-measured in their construction and often 
accompanied by secondary dimension-line elements. 


C. Visual editing 


Special editing modifications to plans could 
include color emphasis of certain areas of interest, 
enlarged representation of details, differentiated line 
vocabulary, and special changes to symbol elements. 
7. Perspective Illustrator
User Overview 


The perspective illustrator shows appearance, 
internal constitution, and external composition. 


Objective pictorials show the natural
appearance of an object under normal 
circumstances. Linear perspective is a major tool in 
this idiom, often reinforced by chiaroscuro 
(graduated light and dark shading). Environmental 
features may be included or omitted depending on 
the communicative purpose. 


Structural pictorials show the internal 
constitution of an object through an imaginary 
“cutaway,” or separation from the total form of the
subject. “Phantom” views which superimpose 
internal and external views are also used to show 
physical structure. 


Assembly pictorials show the composition of a 
physical system in terms of its component parts. A 
primary purpose of this is to show the functional 
relationships between object elements. Perspective 
alignment and guidelines are important features. 
This kind of figure is often called an “exploded” 
view. 
Scenario


The perspective illustration process is as 
follows: 


A. Spatial grammar


A perspective layout scale is set to provide the 
basic framework for organizing and constructing 
volumetric form elements. This framework can vary 
in its layout method (one-point or two-point) and in 
its picture plane data (location of horizon line, 
vanishing points, focal point, view points, etc.). The 
volumetric figure can be created through direct 
construction or projection from plan views. 


B. Form vocabulary 


Subject elements are created/placed within the 
perspective framework. Perspective form elements 
include plane shapes (2-D object sides, components 
or viewpoints), and spatial paths (receding center 
lines, linear forms, etc.). All volumetric objects 
decrease in size in direct proportion to their scaled 
recession into perspective depth from the picture 
plane. 





C. Visual editing 


Detail-level editing for special features is 
enabled by the general graphics facilities, and 
includes alteration of line weights and shape colors 
and textures for local form description. Special 
notes can also be added through visual editing. 
Techniques for Interactive Raster Graphics


Patrick Baudelaire* and Maureen Stone 
Xerox Palo Alto Research Center 


ABSTRACT 


The visual quality of raster images makes them attractive for applications such as 
business graphics and document illustration. Such applications are most fully served 
using interactive systems to describe curves, areas and text which can be rendered at 
high resolution for the final copy. However, to present such imagery in an interactive 
environment for moderate cost is difficult. Techniques are presented that provide 
solutions to the problems of scan conversion, screen update, and hit testing for a class 
of interactive systems called illustrators. The design rests on the use of software display
file encoding techniques. These ideas have been used in the implementation of 
several illustration programs on a personal minicomputer. 


Key Words and Phrases: computer graphics, interactive graphics, display encoding, 
chain encoding, run-length encoding, scan conversion, illustration systems. 


CR Categories: 8.2 


Introduction 


Pictures are a part of a large number of applications. A certain class of pictures can be 
referred to as illustrations. That is, the point of the picture is to illustrate a principle as part of
an article or presentation. Good illustrations are visually interesting, but are judged more on
their content than on their artistic merit. With respect to picture complexity, illustrations are
simple in that there are a moderate number of shapes, clearly bounded by curves and lines.
The image is essentially two dimensional, and color is often restricted to flat filled areas: that
is, uniform colors and simple textures. Illustrations are important in business and publishing
environments today. The advantages to be gained from using a computer to generate these 
images are similar to the ones for word processing: images are easily modifiable, pictures can 
be filed and copies easily generated, subimages can be libraried to facilitate the composition 
of a series of illustrations. However, to be acceptable outside of the experimental 
environment, the image quality of illustrations must be high. Curves and lines must be 
smooth, text must be represented in a variety of fonts, objects must be accurately positioned. 
While each illustration may contain a limited number of colors, a wide range must be 
available. 


Given the type of imagery desired, a raster display will give much better representation 
of the picture than a line oriented display. The constraint of high image quality, even after 
scaling and repositioning operations, means that the picture must be stored using a high 
precision representation. Furthermore, if the data base has sufficient precision, the set of 
affine transformations can be used as tools for generating images. 


The process of designing an image is the mapping of some visualization onto a medium. 
Therefore, any effective tool should provide visual feedback as the image is formed.
Furthermore, the image should be built up in a two-dimensional manner, that is, by pointing
to positions on a page, which indicates an interactive system. In this paper, such a system will
be called an illustrator.


Given a high precision data base, such as endpoints for lines, coefficients for splines, plus 
information about the resolution of the display, there are many standard techniques for
displaying straight lines, curves and areas (5). This process is known as scan conversion.
There are a number of problems in using these techniques in an illustrator, specifically:


• All straight lines and curves must be represented with a finite width. In an
illustration, using lines of different widths is part of the visual effect. Therefore,
displaying these shapes involves more than just digitizing the curve.


• Areas are described by their outline. Therefore, it is necessary to compute which
points are inside the outline and color them in accordingly. While there are many
techniques which address this problem (10, 11), in an interactive system the speed at
which this can be done is an important factor as the display is constantly changing as
the user works.


• In an interactive system, the display is constantly changing. For speed
considerations, it is desirable to do this incrementally, updating only the areas that
are affected by the change. Incremental refresh can cause problems displaying the
correct overlap order for objects.


SIGGRAPH 86 TUTORIAL COURSE NOTES 


TECHNIQUES FOR RASTER GRAPHICS 173 


In an interactive system it is necessary for the user to select objects to be manipulated. A 
natural way to do that is to point to the object. Therefore, given a point on the display, it is 
necessary to determine which object has been selected. This is called hit testing. Hit testing is
the inverse of the display problem, and is even more speed critical than refresh. 


This paper will describe methods for solving these problems that achieve a reasonable
compromise in simplicity of implementation and response time. The solution is based on 
using display encoding techniques which provide a representation of the image that is 
compact, structured, and simple to manipulate. These techniques have been implemented in 
systems which have been used to successfully create illustrations. Examples are included at 
the end of the paper. 


Data Representations and System Design 


Graphics systems can be categorized by the type of data used to represent the picture. 
One type of interactive graphics systems uses raster dots or arrays of intensity samples as the 
unique representation of the image. These systems use a painting model to manipulate the 
dots directly. These systems are easy to use, and have been shown to produce effective 
imagery (8, 9). However, the picture is unstructured, which limits the types of manipulations 
that can be performed without special hardware. The resolution is tied strongly to the display 
resolution, which can limit image quality. In general, to be effective visually, the images from
such systems require large amounts of data, either in resolution or in bits per pixel. We have 
chosen a more structured approach. 


Given a geometric representation, one can consider that geometric elements (lines, 
curves, areas) are all made of filled contours. This paradigm is quite useful for applications 
such as the production of rasters for high-resolution printing of graphics and text (1). 
However, this model is difficult to implement in an interactive application because using the 
geometric data for display and hit testing can be expensive in terms of computation time. An 
interactive display file can be used to provide a definition of the picture that is structured, yet 
can be manipulated fast enough for an interactive system. 





It is natural for several representations to coexist in the design of a graphics system,
usually according to levels of hierarchy. At the top level may exist a representation that
embodies some specific knowledge or meaning relevant to a particular application:
architectural drawings, blueprints for mechanical parts, electrical circuit diagrams, etc. Here
we will consider that the top level representation that we chose aims only at giving an
unambiguous and complete description of high-quality business graphics illustrations.
To implement these ideas, the following systems approach was used: The workbench is a
moderate resolution raster display plus pointing device attached to a 16-bit minicomputer.
The final output device is a high resolution raster device such as a film recorder,
phototypesetter, or raster printer. Shapes are defined by their geometry (trajectories and
contours) plus style information such as line width, colors, and textures. Trajectories can be
specified by one of several mathematical schemes such as splines or other knot-based
approximations, circles or other conical equations. Text is unformatted, and described in
terms of position and string information plus style parameters such as font, color, and
orientation. The user builds display objects by pointing at fitting points and indicating fitting
methods such as straight lines or curves. All numbers are represented as floating point values
to provide sufficient precision. The top level description is converted to an interactive display
file, where the interactive processing (refresh and hit testing) will take place. The display file
is used to generate a bitmap, that is, a one-bit-per-point rectangular array of samples (13),
which is used as the refresh buffer for the display. This structure is summarized in figure 1.


[Figure 1 diagram labels: Picture Definition (geometric representation);
Output Device (high resolution); Interactive Display File (encoded
representation); Display (raster representation).]


Figure 1: Levels of representation.


The Interactive Display File 


The display file representation is used for refresh and hit testing. In designing such a 
representation, the first consideration is to what precision the objects should be encoded.
Clearly, display resolution is sufficient for refresh. Since the user cannot specify a position by
pointing which is more precise than the display resolution, it is also sufficient for hit testing. 
Both hit testing and incremental refresh involve scanning through the display file to 
determine what objects are in a specified area. Therefore, partitioning the display file to 
facilitate culling will increase performance. It is important that the display file be reasonably 
compact, yet not so difficult to generate or decode that it negates the advantages in terms of 
speed. It is also convenient to be able to encode all objects in a uniform manner. 


Given these considerations, we have applied and modified some well known encoding 
techniques, chain encoding of trajectories (3, 6) and run-length encoding of areas (2, 4). These 
techniques are essentially compressed representations of the bitmap. Chain encoding is based 
on the assumption that edges are continuous. Therefore, the basic representation is the
differences between adjacent pixels. Run-length encoding is based on the assumption that 
flat areas contain most of the information in the outlines. Therefore, the basic representation 
is the position of the edge on each scan line. While it would be possible to run-length encode 
all objects, the increased structure in the chain encoding is appealing for lines. 


The particular encoding schemes chosen permit the segmentation of each object into 
pieces that are independent and bounded in display size. It follows from this that the time for 
display of one piece is bounded too. This makes it possible to run the screen refresh as a 
background process. 


All shapes can be described by one of the two types of encoding. Thus, pointing
detection can be done by a single algorithm, independent of the mathematical definition of
the elements (lines, circles, conics, splines, etc.).


Chain Encoding 


Chain encoding is a differential encoding scheme which records the screen coordinate 
increments between successive raster points on a trajectory. That is, from one trajectory point
to the next, raster coordinates may differ only by -1, 0 or 1. Thus, a point may have eight
possible successors, so each new point position could be represented in four bits. But, since
common curve trajectories are monotonic for reasonably long intervals, one can take
advantage of the continuity in direction to further reduce the number of bits per point. A
number of schemes are possible for encoding coordinate increments. Two interesting and
practical ones are described below.
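Before any compression is applied, the differential idea itself can be sketched as follows: a trajectory is reduced to a starting point plus a list of (dx, dy) increments, each component being -1, 0 or 1. The function names and the list-of-tuples representation are assumptions made for this illustration only.

```python
def chain_increments(points):
    """Reduce a list of 8-connected raster points to (start, increments).

    Each increment is a (dx, dy) pair with components in {-1, 0, 1};
    two bits per component gives the four bits per point noted in the text.
    """
    if not points:
        raise ValueError("trajectory must contain at least one point")
    steps = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if max(abs(dx), abs(dy)) != 1:
            raise ValueError("successive points must be adjacent rasters")
        steps.append((dx, dy))
    return points[0], steps


def decode_chain(start, steps):
    """Rebuild the raster points from a start point and its increments."""
    x, y = start
    points = [(x, y)]
    for dx, dy in steps:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points
```

Only the starting point needs full screen coordinates; every other point costs a few bits, which is what makes the more compact schemes worthwhile.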


The first scheme uses two bits to represent the coordinate increments (figure 2). The set
of eight possible curve directions is divided into four quadrants. For each direction quadrant,
the three possible coordinate increments are assigned code values 1 to 3. Code value 0 is used
to indicate a change of quadrant, with the following two bits specifying the new quadrant.
Therefore, the trajectory encoding is a stream of two-bit codes, starting with the quadrant
number (0 to 3), followed by increment codes (1 to 3), terminated by a 0.


Figure 2: Chain Encoding, Two Bits/Direction.
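The two-bit scheme can be sketched as below. Because the figure is not reproduced here, the layout is an assumption: directions are numbered 0 to 7 in 45 degree steps, and quadrant q is taken to contain directions 2q, 2q+1 and 2q+2 (modulo 8), coded 1, 2 and 3. The sketch also assumes the end of the stream is known from the chunk length rather than from a terminator code. Each list element stands for one two-bit code.

```python
def encode_two_bit(directions):
    """Encode direction numbers (0..7) as a list of two-bit codes (0..3)."""
    if not directions:
        return []
    quadrant = directions[0] // 2
    codes = [quadrant]                    # stream starts with a quadrant number
    for d in directions:
        offset = (d - 2 * quadrant) % 8
        if offset > 2:                    # direction lies outside the quadrant
            quadrant = d // 2
            codes += [0, quadrant]        # 0 = change, then the new quadrant
            offset = (d - 2 * quadrant) % 8
        codes.append(offset + 1)          # increment codes 1..3
    return codes


def decode_two_bit(codes):
    """Invert encode_two_bit, returning the list of direction numbers."""
    directions = []
    if not codes:
        return directions
    quadrant = codes[0]
    i = 1
    while i < len(codes):
        if codes[i] == 0:                 # quadrant change marker
            quadrant = codes[i + 1]
            i += 2
        else:
            directions.append((2 * quadrant + codes[i] - 1) % 8)
            i += 1
    return directions
```

A trajectory that stays within one quadrant costs exactly two bits per point; each change of quadrant costs four extra bits.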


The second scheme requires two streams (figure 3). The set of eight possible curve 
directions is divided into eight octants. Within each octant there are two possible directions. 
Therefore, it is possible to indicate each step within an octant with one bit. One stream 
contains the one-bit increment codes for a given direction octant, and the second stream 
contains the octant numbers and the number of steps in each octant. Besides being somewhat 
more efficient, it is possible to get a general idea of the behavior of a curve segment by 
examining the direction octants alone. An example of where this feature can be used is for 
defining edges for the scan converter. Using the chain encoding in the scan conversion 
process will be described further below. 
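A sketch of the two-stream scheme follows. The conventions are assumptions, since the figure is not reproduced: directions are numbered 0 to 7 in 45 degree steps, and octant k is taken to hold directions k and k+1 (modulo 8), coded by the bits 0 and 1. The first returned list is the one-bit stream; the second holds (octant, step count) pairs.

```python
def encode_octants(directions):
    """Split direction numbers (0..7) into a bit stream and an octant stream."""
    bits, runs = [], []
    i, n = 0, len(directions)
    while i < n:
        first = directions[i]
        allowed = {first}
        j = i
        while j < n:
            d = directions[j]
            if d not in allowed:
                # A run may take in one neighbouring direction, no more.
                if len(allowed) == 1 and (d - first) % 8 in (1, 7):
                    allowed.add(d)
                else:
                    break
            j += 1
        octant = first
        if len(allowed) == 2:
            other = next(iter(allowed - {first}))
            if (other - first) % 8 != 1:  # the other direction is the lower one
                octant = other
        bits.extend((d - octant) % 8 for d in directions[i:j])
        runs.append((octant, j - i))
        i = j
    return bits, runs


def decode_octants(bits, runs):
    """Rebuild the direction numbers from the two streams."""
    directions, pos = [], 0
    for octant, count in runs:
        for bit in bits[pos:pos + count]:
            directions.append((octant + bit) % 8)
        pos += count
    return directions
```

The octant stream alone gives a coarse description of the curve, which is the property the text suggests exploiting for edge definition.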


As mentioned above, in order to gain efficiency in the screen updating and pointing 
detection algorithm, the encoding is fragmented into independent pieces of similar length that 
we call chunks (figure 4). The bounding box for each chunk can be stored to facilitate culling 
on refresh and hit testing. 
Figure 3: Chain Encoding, One Bit/Direction.


Figure 4: Curve Divided into Encoded Chunks.


The display file contains the following information for each chunk:
Screen coordinates of the starting point S 

Stream(s) for the chain encoding 

The bounding frame: H and W. 


It is interesting to note that if the chunk size is such that each segment contains at most N 
trajectory points, all these points are enclosed in a square of size 2N centered at the starting 
point of the segment. So even if the bounding box is not explicitly stored with the chunk, a 
bounding region can be computed. 
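The chunking and its bounding frame can be illustrated as follows. This is a sketch under assumptions: steps are kept as plain (dx, dy) pairs rather than packed bits, the frame is stored as (x, y, W, H), and the dictionary layout and function names are hypothetical.

```python
def split_into_chunks(start, steps, max_steps):
    """Cut a chain of (dx, dy) steps into chunks of at most max_steps steps,
    recording each chunk's starting point and bounding frame."""
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    chunks = []
    x, y = start
    for i in range(0, len(steps), max_steps):
        piece = steps[i:i + max_steps]
        chunk_start = (x, y)
        min_x = max_x = x
        min_y = max_y = y
        for dx, dy in piece:
            x, y = x + dx, y + dy
            min_x, max_x = min(min_x, x), max(max_x, x)
            min_y, max_y = min(min_y, y), max(max_y, y)
        chunks.append({
            "start": chunk_start,
            "steps": piece,
            "frame": (min_x, min_y, max_x - min_x, max_y - min_y),
        })
    return chunks


def chunk_may_contain(chunk, px, py):
    """Cheap culling test: can the point lie on this chunk at all?"""
    fx, fy, w, h = chunk["frame"]
    return fx <= px <= fx + w and fy <= py <= fy + h
```

Since each step moves at most one raster in x and in y, a chunk of N steps can never leave the square of side 2N centered on its starting point, so the stored frame is always at least as tight as that computed bound.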


Run-Length Encoding 


Run-length encoding defines an area in terms of a starting scan line (Y) value plus a list 
of pairs of raster (X) values. In practice, it is more efficient to make the second X value 
relative to the first, so each run is defined as a starting X (SX) and a delta X (DX) value. The 
list of runs can be broken into chunks, such that each chunk defines a maximum of N runs. 
Therefore, each chunk has a bounding frame. It is desirable to make the starting X values 
relative to this frame, so that the chunk can be relocated simply by translating the frame 
boundaries (figure 5).


The encoding described above works only for convex areas, specifically, it assumes one 
continuous run of rasters per scan line within the area. For concave shapes, there are two 
options: break the area into convex pieces, or introduce a flag into the list of runs that defines 
the number of runs per scan line. We have implemented the second option, choosing a 
negative starting X as a flag, signaling that the delta X value is the new number of runs per 
scan line. 
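The run list with its flag convention can be sketched as follows. Several details are assumptions for the purpose of illustration: spans are given as inclusive (x_start, x_end) pairs, DX is taken to be x_end minus x_start, starting X values are made relative to the left edge of the frame, and the flag uses -1 as its negative starting X.

```python
def encode_runs(rows, frame_x):
    """Encode scan lines (top to bottom) as a flat list of (SX, DX) pairs.

    `rows` holds, for each scan line, a non-empty list of (x_start, x_end)
    spans. A pair with negative SX is a flag: its DX gives the new number
    of runs per scan line. One run per line is assumed initially.
    """
    encoded = []
    runs_per_line = 1
    for spans in rows:
        if not spans:
            raise ValueError("every scan line needs at least one run")
        if len(spans) != runs_per_line:
            runs_per_line = len(spans)
            encoded.append((-1, runs_per_line))
        for x_start, x_end in spans:
            encoded.append((x_start - frame_x, x_end - x_start))
    return encoded


def decode_runs(encoded, frame_x):
    """Rebuild the per-scan-line spans from the encoded run list."""
    rows, current = [], []
    runs_per_line = 1
    for sx, dx in encoded:
        if sx < 0:                      # flag: change runs per scan line
            runs_per_line = dx
            continue
        current.append((frame_x + sx, frame_x + sx + dx))
        if len(current) == runs_per_line:
            rows.append(current)
            current = []
    return rows
```

Because frame-relative starting X values are never negative, the flag cannot be confused with an ordinary run.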


Assuming the run-length encoding describes areas at display resolutions, the starting and
delta X values need be no larger than the maximum display coordinate. In practice, most
runs can be described in fewer bits. Therefore, it is worthwhile to consider using a variable
field for X. While maximum compression would be obtained by using a technique such as
Huffman coding for field size, for simplicity we have chosen to implement two fixed fields (8
and 16 bits).
Figure 5: Run-Length Encoding, Showing Chunks.


The display file contains for each chunk: 


Frame boundary: upper left point S in screen coordinates (includes the starting Y value), 
plus frame size H and W. 


Field length: either bytes or words. 


List of run values, defined as SX relative to S, and DX relative to SX. 
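

The chunk layout listed above can be sketched roughly as follows. This is only an illustration in Python, not the record format of the original system: the names `RunChunk`, `encode_chunk`, `decode_chunk` and `translate` are hypothetical, fields are treated as unsigned, and only the convex case (one run per scan line) is handled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RunChunk:
    sx: int          # frame origin S: left edge in screen coordinates
    sy: int          # frame origin S: starting scan line (Y)
    width: int       # frame size W
    height: int      # frame size H (one run per scan line in this sketch)
    field_bits: int  # 8 or 16, chosen once per chunk
    runs: List[Tuple[int, int]]  # (SX relative to S, DX relative to SX)


def encode_chunk(start_y, spans):
    """spans: one inclusive (x_left, x_right) pair per scan line."""
    left = min(x0 for x0, _ in spans)
    right = max(x1 for _, x1 in spans)
    runs = [(x0 - left, x1 - x0) for x0, x1 in spans]
    largest = max(max(sx, dx) for sx, dx in runs)
    field_bits = 8 if largest < 256 else 16  # assumption: unsigned fields
    return RunChunk(left, start_y, right - left + 1, len(spans), field_bits, runs)


def decode_chunk(chunk):
    """Yield (y, x_left, x_right) in screen coordinates."""
    for i, (sx, dx) in enumerate(chunk.runs):
        x0 = chunk.sx + sx
        yield chunk.sy + i, x0, x0 + dx


def translate(chunk, dx, dy):
    """Relocating a chunk moves only its frame; the run list is shared."""
    return RunChunk(chunk.sx + dx, chunk.sy + dy, chunk.width,
                    chunk.height, chunk.field_bits, chunk.runs)
```

Because every SX is stored relative to the frame, `translate` never touches the run list, which is the property the frame-relative encoding is meant to provide.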


Scan Conversion 


The geometry-to-rasters conversion (scan conversion) process can be decomposed into two 
operations: converting the geometry into the display file representation, and converting the 
display file to rasters. The conversion from geometry to display file need only be performed 
when an object is created or its shape is changed. Because this is a relatively infrequent 
operation, standard techniques for digitizing curves and filling areas provide acceptable 
speed. The specific algorithm used for areas is given at the end of this section. Objects are 
displayed in back to front order to give the correct overlap information. All display, refresh, 
repositioning, and hit testing operations can be implemented by manipulating the display file. 


Display 


The display operation is defined as the conversion from the display file representation to 
the rasters which are stored in the bitmap. The final action of writing raster bits into the 
bitmap is implemented by an efficient and versatile firmware function called RasterOp. This 
function copies and modifies bit patterns from an arbitrary rectangular area in picture 
memory to another rectangular area. RasterOp is described in more detail in (13) and (5). 


For areas, the run-length encoding can be displayed directly using RasterOp to display 
each run as a one line high area. To draw curves we use the paradigm of painting with a 
"brush" moving along their trajectories. The chain encoding provides a digitized 
representation of the trajectory. A round brush will approximate a line of uniform width 
equal to the brush diameter. In the simplest implementation of this model, RasterOp can be 
used to paint an image of the brush at each new raster position along the trajectory. However, 
because the successive brush images overlap, many of the pixels along the trajectory are 
written several times. The model can better be implemented by breaking down the brush 
into a set of horizontal sections and accumulating the rasters filled as the brush moves. Once 
a scan line is complete, that is, the brush has moved far enough that it no longer touches the 
scan line, the entire horizontal section can be displayed using RasterOp. This implementation 
is more efficient than the traveling brush because the affected rasters are only written once. 
Furthermore, the form of RasterOp used to display lines has been reduced to the same case 
used for areas. This uniformity makes applying colors or halftones to objects straightforward. 
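

The accumulating-brush model described above might be sketched as follows. This is a simplified illustration, not the original implementation: `brush_sections` and `paint_trajectory` are hypothetical names, the `emit` callback stands in for the one-line-high RasterOp call, and the trajectory is assumed to be a chain of adjacent raster points so that the sections accumulated on one scan line always form a single run.

```python
def brush_sections(radius):
    """Horizontal sections of a round brush: dy -> half-width in rasters."""
    return {dy: int((radius * radius - dy * dy) ** 0.5)
            for dy in range(-radius, radius + 1)}


def paint_trajectory(points, radius, emit):
    """Accumulate brush sections per scan line; emit each finished run once.

    points: adjacent (x, y) raster positions along the trajectory.
    emit:   callback emit(y, x_left, x_right), a stand-in for RasterOp.
    """
    sections = brush_sections(radius)
    pending = {}  # y -> [x_min, x_max] accumulated so far
    for cx, cy in points:
        # Flush scan lines the brush no longer touches.
        for y in [y for y in pending if abs(y - cy) > radius]:
            x0, x1 = pending.pop(y)
            emit(y, x0, x1)
        # Extend the pending run on every scan line the brush covers now.
        for dy, half in sections.items():
            span = pending.setdefault(cy + dy, [cx - half, cx + half])
            span[0] = min(span[0], cx - half)
            span[1] = max(span[1], cx + half)
    for y in sorted(pending):
        emit(y, *pending[y])
```

If the trajectory returns to a scan line after leaving it, this sketch simply emits a second run for that line.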


Refresh 


In an interactive graphics system, the displayed image is constantly changing as the user 
adds, transforms, and deletes objects. In a system which contains only lines and curves, it is 
possible to write all new objects as they are created or repositioned, and to simply leave the 
small areas that are erased out of overlapping objects (from a transform or delete operation) 
unrefreshed. The user can then replot the entire picture when the image degrades too much. 
For systems which include filled areas, this approach is inadequate because the amount of the 
picture obliterated by erase operations overlapping other objects is too extensive. It is 
therefore necessary to find some way to quickly and accurately refresh subareas of the entire 
picture. We will call this process incremental refresh. 


The screen update process should be as fast as possible, yet must leave the screen in the 
correct state as to the shape of the objects and their overlap order. It follows that the design 
considerations for an incremental refresh algorithm are: 


The definition of an area to refresh should contain a minimum number of rasters that 
need to be regenerated. 


Objects within the refresh area must be replotted quickly and with no rippling effects. 


If the incremental refresh is going to run as a background process, the time necessary to 
refresh an area must be bounded. 


In general, the affected area has such a complex boundary that determining exactly 
which rasters fell inside the outline would be too time consuming a process. It is therefore 
necessary to use some approximation to the area. The simplest approximation is a rectangle. 
While for certain operations, such as erasing a diagonal line, the minimum rectangle that 
describes the area affected covers far more of the picture than actually need be redisplayed, 
rectangles are much easier to manipulate than other shapes such as trapezoids. 


Once a refresh area has been defined, the objects which are affected must be found. This 
is the same process as hit testing, except that each object must be compared to the boundaries 
of the refresh area instead of a small area around a point. If the display system contains an 
efficient clipper with variable clipping boundaries, the update problem can be solved simply 
by setting the clipping region to the boundaries of the affected area and refreshing the entire 
screen. In the type of system we are describing here, since the rasters are generated from the 
display file, the partitioning of the display file into bounded chunks provides a method for 
fast culling for this type of refresh. 


If the clipper is not used, the problem of rippling effects can arise because the object or 
chunk definition will in general generate rasters outside of the refresh area, which can affect 
objects not currently in the list of objects to be refreshed. For example, in figure 6a, object A 
is shown as overlapping object B. Part of object B needs to be refreshed (figure 6b). In 
refreshing object B, care must be taken that the correct overlap order is maintained between 
A and B. If overlap order is determined simply by back to front refresh of the display, just 
replotting all of B will result in B appearing to be on top of A (figure 6c) unless object A is 
also refreshed. 


Figure 6: Rippling Effects of Refresh. (c) Replotting all of B gives incorrect overlap. 


It is common for the user of an interactive graphics system to operate on several objects 
in a picture at one time, leaving several areas needing to be refreshed. These objects may or 
may not overlap. One approach is to simply accumulate a maximum area as each object is 
operated on. However, if the objects are disjoint, it can give a very bad estimate of the 
affected area. Another approach is to treat each object independently. However, if the 
objects overlap, intersecting refresh areas will be redisplayed several times, which, while 
leaving the display in a correct state, is distracting to the user as well as time consuming. A 
third approach is to accumulate a refresh area for each object, and then process these areas to 
eliminate overlap cases. In a system where RasterOp is the limiting factor for display 
operations, the third approach is far superior to the other two. 
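

One plausible way to process the accumulated refresh areas is sketched below. The text does not say how overlap cases were eliminated in the original system; this sketch assumes that intersecting rectangles are simply replaced by their bounding union until all remaining areas are disjoint. The function names are hypothetical.

```python
def intersects(a, b):
    """Rectangles are (x0, y0, x1, y1) with inclusive raster bounds."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a, b):
    """Smallest rectangle enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def merge_refresh_areas(rects):
    """Return disjoint refresh rectangles covering all the input rectangles."""
    merged = []
    for r in rects:
        changed = True
        while changed:
            changed = False
            for m in merged:
                if intersects(r, m):
                    merged.remove(m)
                    r = union(r, m)   # may now touch other areas: rescan
                    changed = True
                    break
        merged.append(r)
    return merged
```

Disjoint areas are kept separate, so two objects far apart never force a refresh of everything between them, while overlapping areas are redisplayed only once.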


Area Scan Conversion 


Polygon scan conversion is described in detail in chapter 16 of (5), and much of the 
terminology in this section will be taken from that source. Here we would like to outline an 
approach that has been shown to be adequate for line and curve bounded areas, including 
concave areas, areas with holes, and areas with twists. 


The problem is to display solid areas which have overlap order but no depth. That is, 
areas do not intersect in the Z direction. The areas are bounded by combinations of spline 
curves and lines, and are not strictly convex. The pictures displayed are of moderate 
complexity, probably around 20-25 areas. 


Overlap order is resolved using the painter's algorithm. That is, objects are displayed in 
overlap order, back to front, such that front objects simply "paint over" objects that are 
behind them. 


The outline of each area is chain encoded. Each chunk is constrained to be monotonic. 
Scan conversion occurs at display resolution, using the chain encoding as the edge definition. 
Each area is taken separately, and the outline is broken into a list of "edges" which are 
monotonic in the scan direction. This is determined by examining the chunks of chain 
encoding. These edges can be sorted on Y, then the intersections for each scan line are 
determined by running along the encoding. The X values are sorted, and taken pairwise to 
define the filled interior of the object. This is essentially the Y-X algorithm described for 
polygons in (5). 


Care must be taken when defining the edges that endpoints for edges are properly 
defined. The chain encoding is one continuous stream with one definition for each raster 
point. However, it is essential for the Y-X algorithm to have an even number of edges for 
each scan line. Therefore, the endpoint of edges that fall at maxima/minima of the object 
must be doubled. Furthermore, points which are not at a maximum/minimum in the scan 
direction must not be defined on two edges. This can be achieved by making edge boundaries 
only at maxima/minima in the scan direction (figure 7). 


It is important to note that there may be horizontal sections inside an edge (figure 8). 
Therefore, it is necessary to return a range of X values for each edge intersection at each scan 
line. The left or right value is taken depending on whether the edge is a left or right edge. 
This determination takes place after the X values are sorted. 
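

The pairing step described above can be illustrated with a small sketch. It assumes the outline has already been split into edges that are monotonic in Y, each given as a list of raster points with extremum endpoints appearing on both adjacent edges. The name `fill_spans` is hypothetical, and taking the left value for left edges and the right value for right edges (so that the fill includes the outline rasters) is one reading of the text.

```python
from collections import defaultdict


def fill_spans(edges):
    """edges: lists of (x, y) raster points, each monotonic in Y.

    Returns {y: [(x_left, x_right), ...]} describing the filled interior.
    """
    rows = defaultdict(list)
    for edge in edges:
        # An edge may run horizontally for a while, so keep an X range
        # (lowest and highest X) for each scan line it crosses.
        per_line = {}
        for x, y in edge:
            lo, hi = per_line.get(y, (x, x))
            per_line[y] = (min(lo, x), max(hi, x))
        for y, x_range in per_line.items():
            rows[y].append(x_range)

    spans = {}
    for y, ranges in rows.items():
        ranges.sort()
        if len(ranges) % 2:
            raise ValueError("odd number of edges on scan line %d" % y)
        # After sorting, even entries are left edges, odd entries right edges.
        spans[y] = [(ranges[i][0], ranges[i + 1][1])
                    for i in range(0, len(ranges), 2)]
    return spans
```

The `ValueError` marks the case the text warns about: if an extremum endpoint is not doubled, some scan line sees an odd number of edges.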


The main advantages to this approach are speed and consistency. All curves are 
converted to rasters using some standard algorithm. Once this is done, any analysis of the 
curve, such as for monotonicity or for intersections, is done using the chain encoding. 
Furthermore, if the area is outlined, the outline and the edge of the filled interior are 
guaranteed to match since they come from the same digitizing algorithm. 


Figure 7: Edge Definition. 


Hit Testing 


By the term "hit testing" we mean: given some target point, usually the display 
coordinates of a cursor, which is the selected object? In hardware augmented systems, hit 
testing is done by the clipping and display system. The clipping boundary is set to a small 
window around the hit point, and the entire picture is refreshed. An object falling inside the 
window is returned as a possible hit. In a raster system, redisplaying the object list is too slow, 
and redisplaying the rasters provides no structural information. However, the segmented 
encoding provides both structural information and a way to quickly determine which objects 
are candidates. 


Figure 8: Horizontal Sections Inside Edge. 


The frame information on the encoding chunks provides a method for quickly culling 
out those segments which do not fall near the target point. The remaining chunks can be 
decoded using the same routines which display the encoding, except that the resulting points 
are compared to the target point instead of being displayed. Once the objects which lie near 
the target point are identified, some algorithm must be applied to select one of the objects. 
Some considerations are: absolute distance from the point, overlap order of the objects, 
preferred objects, etc. 
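

The two-stage hit test described above (cull by chunk frame, then decode the surviving chunks and compare points) might look roughly like this. The data layout, the tolerance parameter and the selection rule (nearest point first, then frontmost object) are assumptions made for the sketch; the text leaves the final selection policy open.

```python
def near_frame(frame, target, tol):
    """frame is (x0, y0, x1, y1); true if target lies within tol of it."""
    x0, y0, x1, y1 = frame
    tx, ty = target
    return x0 - tol <= tx <= x1 + tol and y0 - tol <= ty <= y1 + tol


def hit_test(objects, target, tol=2):
    """objects: back-to-front list of (name, chunks).

    Each chunk is (frame, points), where points are the decoded rasters.
    Returns the name of the selected object, or None.
    """
    tx, ty = target
    best = None
    for order, (name, chunks) in enumerate(objects):
        for frame, points in chunks:
            if not near_frame(frame, target, tol):
                continue  # culled without decoding the chunk
            for x, y in points:
                d2 = (x - tx) ** 2 + (y - ty) ** 2
                if d2 <= tol * tol:
                    key = (d2, -order)  # nearer wins, then frontmost
                    if best is None or key < best[0]:
                        best = (key, name)
    return best[1] if best else None
```

In a real system the points would be produced by the same decoding routine that displays the chunk, rather than stored alongside the frame as they are here.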


Conclusions 


Using these design principles, it is possible to make an interactive system which uses a 
raster display for design, yet has a geometric data structure which can be used to generate 
quality output on a high resolution raster device such as a film-recorder, a photo-typesetter, or 
a laser printer. Two systems using these principles have been designed and implemented on 
the Alto personal computer. One provides only lines and curves, the other also provides the 
capability for filled areas. Typical imagery is shown in figure 9 and in the illustrations in this 
paper. 


Figure 9: (a) Pine T 


Figure 9: (b) Gray Head by Dave Bickford. 


The graphic designer of the future will work with a color display as a workbench and a 
powerful, interactive computer system as a toolbox. This prediction is based on the 
wide use of computers for text processing and production layout. Just as word 
processing systems have replaced typewriters, computer-based design systems will 
replace pen, knife and light-table for certain types of graphic designs. Included in this 
vision of the future are tools to manipulate color. This paper will examine some of the 
issues associated with color in such an environment. 


Introduction 


Our goal in the Imaging Research Group of the Computer Science Laboratory at the 
Xerox Palo Alto Research Center is to understand how to control color in our computing 
environment, which consists of powerful, personal workstations connected via a high-speed 
digital network to each other and to a variety of hardcopy output devices, specifically, digital 
printers and film recorders. Because there are many uses for color in this environment, no one 
technique will support all applications. We need to provide a framework which supports a 
variety of different methods for using color. 


The principal application discussed in this paper is computer aided graphic design. The 
designer uses interactive tools to form a picture on the workstation display, then sends a 
printable representation of the design to one of the output devices. The monitor never gives 
exactly the same appearance as the printer, but a good approximation will minimize the 
number of passes through the design/print/look/adjust cycle. 


Some designs are not directed towards a specific output device but are intended for use 
throughout the environment. Then the workstation monitor is no longer a “proof copy” of a 
specific printer but one of a family of media, each of which has its own capabilities such as 
color gamut and resolution. For this problem it is important to define methods that make it 
possible to represent and render a design effectively across a range of devices, a concept we 
call device independence. 


Let us take some examples of the kind of designs that are done here and what the color 
issues are. 


Figures 1 and 2 show designs containing regions of flat, filled color. The lilac picture was 
designed with an interactive illustration program, the spiral by mathematical transformations 
of the Xerox logo font. The initial colors were selected by name from a color palette. Once a 
test print was run, the colors were adjusted using a color order system. If the color names, 
color selection and proofing systems model the printer accurately, this type of design is 
straightforward. If they do not, it may take many adjustments to achieve a satisfactory result. 
At the time these images were designed we did not have a good model of the color printer. 
Both the pale yellow-green for the background and the dark colors in the Xerox spiral were 
difficult to control. 


The foreground pattern in figure 3, the dragon head, flame and border, was designed 
with an interactive illustrator. Using a simple program, this pattern was repeated on a 
diagonal grid to form a watermark. Each element of this design is colored with a smoothly 
varying hue. The dragon's flame and border color is a gradation from dark to medium 
reddish-brown. The watermark color interpolates from green to blue, and the background is 
similar to the foreground. The designer achieved these effects by working with the three color 
separations directly. He built a pattern that generated an intensity gradation from light to 
dark with variable extremes and rates of change. Overlapping these patterns in different 
separations creates a wide range of interesting effects. The foreground pattern, for example, 
involved gradations in the magenta and yellow separations. The watermark colors involve the 
cyan and yellow separations. To choose his colors the designer created a set of samples 
showing different pairs of toners at different screen densities, similar to the Pantone 
(registered trademark) book for offset color printing. This gave him samples of points along 
the gradation. He used these to set the initial values for the gradations and did the final 
adjustments by trial and error. 


Figure 1: Lilacs; Figure 2: Xerox Spiral 


Figure 3: Dragon poster 


This design illustrates the issue of specifying and reproducing a continuously changing 
color. The specification of the color gradations here was tied directly to the mechanics of the 
color separation process. However, an extension of this technique would allow a designer to 
specify a controlled change in any color, leaving the mechanics of color separation to the 
computer system. Using a computer model of a color space we can vary a wide range of 
parameters to achieve interesting effects. 


Figure 4: c) Flow diagram with black and outline. 


Figure 4 shows a different set of design considerations relating to color. Here the color is 
used to code two different paths through the flow diagram as shown in figure 4a. The specific 
red and blue are not too important except that they should have similar impact; that is, the 
two paths should appear equally important. They also need to be dark enough that the shapes 
of the arrows are clearly distinguishable. Figures 4b and 4c show the same diagram rendered 
for a monochrome device. In figure 4b, each color is represented by a gray value based on the 
brightness of the original color. While this approximation works well for black and white 
images of scenes, as in black and white photographs, for this illustration the difference 
between the two paths is lost. Furthermore, gray has already been used as a design element in 
the borders and arrows in another section of the picture. Figure 4c shows the illustration 
redesigned for a monochrome rendering, using black and outlined arrows to distinguish the 
two paths. 


To solve this type of problem the designer needs to specify the intended purpose of the 
color at a more abstract level than the actual hues. Our approach to this problem is graphical 
style, which we define as representing the rendering characteristics in such an illustration as a 
set of names and rules for rendering [1]. In the previous example, the “color” of the two paths 
would be names such as “HighlightColor-1” and “HighlightColor-2”. The rules for rendering 
these colors would be different for the monochrome device and the color device. The 
collection of all rules for rendering an image is called its style, and is also named. A set of 
related figures would use the same style to ensure consistency. Our intent is to capture the 
design decisions needed to provide a good rendering independent of the content of the 
illustration. 
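

The idea of a named style can be illustrated with a small table of rules. Everything in this sketch is hypothetical except the two color names taken from the text: the style names, the rule fields and the specific values are invented for illustration and do not describe the system in [1].

```python
# Each named style maps abstract color names to device-specific rules.
STYLES = {
    "FlowDiagram-Color": {
        "HighlightColor-1": {"fill": (0.8, 0.1, 0.1), "outlined": False},
        "HighlightColor-2": {"fill": (0.1, 0.1, 0.8), "outlined": False},
    },
    "FlowDiagram-Monochrome": {
        # Distinguish the two paths by solid black versus outlined arrows.
        "HighlightColor-1": {"fill": (0.0, 0.0, 0.0), "outlined": False},
        "HighlightColor-2": {"fill": (1.0, 1.0, 1.0), "outlined": True},
    },
}


def rendering_rule(style_name, color_name):
    """Look up how an abstract color name is rendered under a named style."""
    return STYLES[style_name][color_name]
```

The illustration itself stores only names such as "HighlightColor-1"; switching the style name re-renders every figure that shares the style without touching its content.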


This paper describes work in progress towards solving some of the problems identified in 
these examples. The remainder of the paper is in roughly two parts. First is a description of 
the characteristics of our color monitors and printers, including a discussion on calibration 
and standards. Following that is a description of the tools and capabilities for color selection 
that can be offered on a computer driven workstation. The conclusion will present a 
framework for addressing the problems of color specification and reproduction in this 
environment. 


Types of color devices 


A color monitor produces colors when the electron beam stimulates red, green and blue 
phosphor dots to produce a pattern of red, green and blue luminous spots. The brightness of 
the dots is a function of the power supplied by the electron beam. The color system described 
is additive and the three primaries are independent [5]. A color for a monitor, therefore, is 
specified as a red, green, blue triple, hereafter called an RGB value. A monitor must be 
properly adjusted so the baseline and gain for the three guns are balanced to give a neutral 
gray scale for equal values of red, green and blue throughout the range of black to white. The 
natural relationship between input and luminance is not linear, but for the purposes of this 
paper we will assume that the colors have been compensated to remove this non-linearity [4]. 


A typical monitor can generate on the order of 16 million colors, and a good graphic 
design system would give the user control over the entire gamut. Due to cost or other 
limitations in the computer hardware, however, the specific palette of colors available to a 
design system may be limited to 256 or fewer. Such a system typically uses a table called a 
colormap to map a number between 0 and 255 to an RGB value. Therefore, while only 256 
colors are available, they can be selected from the complete gamut of 16 million. This 
limitation means color approximation may be an issue for a color design system. The usual 
solution to this problem is a technique called dithering, which simulates colors with color 
patterns [6, 9]. Our experience has shown that most of the color gamut can be adequately 
simulated with patterns of 128 colors, leaving 128 for the cases where the approximation is 
inadequate. 


Figure 5: Digital simulation of halftone patterns. 
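

As a rough illustration of a colormap combined with dithering, the sketch below approximates a color that lies between two colormap entries by mixing the two entries in a fixed pattern. Ordered dithering with a 4 by 4 threshold matrix is used here only as a familiar example; the text does not say which dithering method from [6, 9] was used, and the gray-ramp colormap and the function name are placeholders.

```python
# Standard 4x4 ordered-dither (Bayer) threshold matrix.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]

# Placeholder colormap: 256 entries, each an RGB triple (a gray ramp here).
colormap = [(i, i, i) for i in range(256)]


def dither_index(x, y, fraction, low_index, high_index):
    """Choose between two colormap entries at pixel (x, y).

    Over any 4x4 tile, about `fraction` of the pixels get high_index.
    """
    threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) / 16.0
    return high_index if fraction > threshold else low_index


# Approximate gray level 80 using only entries 64 and 128.
fraction = (80 - 64) / (128 - 64)  # a quarter of the pixels use entry 128
tile = [[dither_index(x, y, fraction, 64, 128) for x in range(4)]
        for y in range(4)]
```

Averaged over the tile, the displayed intensity matches the requested level even though only two colormap entries are used.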


A digital color printer produces color by overlaying spots of magenta, cyan, yellow and 
(optionally) black inks on paper. The spots are all the same size as defined by the resolution 
of the printer. Printer resolutions range from 120 spots per inch for low cost ink-jet to 1200 
spi for phototypeset quality color separations. Intensity variation is achieved by the use of 
patterns called halftones, as in conventional offset printing [13]. The technique of halftoning 
trades spatial resolution for intensity levels as shown in figure 5. Commercial printing 
typically uses halftone frequencies of 80 to 150 lines per inch. To simulate this on a digital 
printer we need printer resolutions of approximately 600 to 1200 lines per inch. Many color 
digital printers do not have sufficient spatial resolution to simulate even an 80 line screen. 
The designer must balance, therefore, the problems of seeing contours in shaded regions 
against the fuzziness and texturing caused by using a coarse screen. 
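

The trade between spatial resolution and intensity levels can be made concrete with a small calculation. The sketch assumes a square, unrotated halftone cell built from whole printer spots, which is a simplification of real screens; the function name is hypothetical.

```python
def halftone_tradeoff(printer_spi, screen_lpi):
    """Return (cell side in spots, number of intensity levels).

    A cell of n x n spots can show n*n + 1 levels (0 to n*n spots inked).
    """
    cell = printer_spi / screen_lpi
    levels = int(cell) ** 2 + 1
    return cell, levels


for spi, lpi in [(1200, 150), (600, 80), (300, 80)]:
    cell, levels = halftone_tradeoff(spi, lpi)
    print(f"{spi} spi at {lpi} lines/inch: cell {cell:.2f} spots, {levels} levels")
```

Under these assumptions a 1200 spi device gives an 8 by 8 cell (65 levels) behind a 150 line screen, while a 300 spi device manages only a 3 by 3 cell (10 levels) behind an 80 line screen, which is where contouring in shaded regions comes from.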


A typical printer is neither linear nor independent with respect to its primary colors. A 
designer may compensate for this by using a set of color swatches. The number of samples 
needed to adequately cover the color space, however, is very large. One benefit of a computer 
controlled environment is that it is relatively easy to algorithmically define a set of test 
patterns which explore the region of the color space of interest. A better approach is to 
develop methods for converting colors specified in the well behaved RGB color space to a set 
of halftone patterns on the printer. The results of this work are elsewhere in these proceedings 
[12]. 


Standards and Calibration 


As indicated in the introduction, it is important to be able to create designs that can be 
effectively displayed or printed on all of the monitors and printers in the system. To do so 
implies developing a standard representation of color and calibrating all the devices to it. Our 
goal is to develop this standard calibrated, if possible, to an international standard such as the 
system for computing tristimulus values recommended by the CIE (Commission 
Internationale de l'Eclairage). 


Given a set of parameters that describe the phosphors and the voltage/intensity transfer 
curves, it is relatively straightforward to calibrate a color monitor with respect to the CIE 
coordinate system such that it is possible to convert an RGB triple to a tristimulus value, XYZ, 
and vice versa, within the gamut of the monitor [5]. It is also possible, given the illuminant, to 
measure the CIE tristimulus values for samples of inks and papers, and therefore to compute 
the tristimulus value of any region of paper covered by a simple halftone pattern. A simple 
halftone pattern is one involving only patches of opaque ink and bare paper. If two inks 
overlap they must overlap completely and the tristimulus values for the combination must be 
measured independently, that is, they cannot be derived from the coordinates of the two inks. 
For more complex cases (and unfortunately, most useful cases are more complex) things are 
not so well defined, so again the work of converting from a well behaved system like RGB to 
the printer is crucial. 
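

For a monitor whose RGB values are already linear in luminance, the conversion to tristimulus values is a 3 by 3 matrix that can be derived from the chromaticities of the phosphors and of the white point. The sketch below shows that derivation. The chromaticities used are the published NTSC primaries and illuminant C, chosen only as a worked example; a real monitor would need its own measured values.

```python
def inverse3(m):
    """Inverse of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def apply3(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def rgb_to_xyz_matrix(red, green, blue, white):
    """Each argument is an (x, y) CIE chromaticity pair."""
    def xyz_unit_luminance(xy):
        x, y = xy
        return [x / y, 1.0, (1.0 - x - y) / y]

    prims = [xyz_unit_luminance(p) for p in (red, green, blue)]
    p = [[prims[c][r] for c in range(3)] for r in range(3)]  # columns = primaries
    # Scale each primary so that R = G = B = 1 reproduces the white point.
    scale = apply3(inverse3(p), xyz_unit_luminance(white))
    return [[p[r][c] * scale[c] for c in range(3)] for r in range(3)]


# Worked example: NTSC primaries with illuminant C as white.
M = rgb_to_xyz_matrix((0.67, 0.33), (0.21, 0.71), (0.14, 0.08), (0.3101, 0.3162))
xyz = apply3(M, [1.0, 0.5, 0.25])        # RGB -> XYZ
rgb = apply3(inverse3(M), xyz)           # XYZ -> RGB, within the gamut
```

For these example chromaticities the middle row of `M`, which yields the tristimulus value Y, comes out close to the familiar weights 0.299, 0.587 and 0.114.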


The work reported by Starkweather [12] shows that it may be possible to develop a color 
standard based on red, green, blue primaries calibrated with respect to the CIE standard. 
Conversion of a point in color space between any two monitors would be an algebraic 
transformation. Each printer could be modeled and calibrated with respect to the standard. 
An additional feature is that there is yet another linear transformation that extracts the 
luminance information from the model, that is, the calculation for the tristimulus value Y. As 
in the NTSC broadcast television standard [7], Y can be used to produce an acceptable gray 
approximation of a color on a monochrome device such as a grayscale monitor or black/white 
laser printer. 
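

The gray approximation mentioned above reduces to a single weighted sum. The sketch uses the NTSC luminance weights and assumes linear RGB values in the range 0 to 1; the weights appropriate to a given calibrated standard would depend on its primaries.

```python
def luminance(r, g, b):
    """Approximate tristimulus Y for linear RGB in [0, 1] (NTSC weights)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_gray(rgb):
    """Gray RGB triple with the same luminance as the given color."""
    y = luminance(*rgb)
    return (y, y, y)


red_gray = to_gray((1.0, 0.0, 0.0))
blue_gray = to_gray((0.0, 0.0, 1.0))
```

Because the mapping keeps only luminance, any two colors that happen to share a Y value become indistinguishable on the monochrome device, which is the limitation illustrated by figure 4b.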


For any color specified in this standard, the actual rendering on a device will be an 
approximation of that color. There are a number of reasons why this is so. The lighting 
conditions typically will not be precisely controlled. Physical devices have limited gamuts 
and the colors within the gamuts are quantized, sometimes dramatically. Tradeoffs must be 
made between speed and accuracy. To be effective, the color specification must include some 
information from the designer which makes it possible to produce the best approximation of 
that color for that design. It also seems clear that any form of device independence, meaning 
a single specification of a color that defines it for all devices, will only work within a range of 
color values and quality levels. Only as different applications of color in design environments 
mature will it be possible to see how the definition of device independence evolves. 


Color selection 


The novel aspect of the computer driven color display compared to other graphic arts 
media is that it allows a designer to adjust and view colors in real-time. This is different from 
any other color medium currently used in the graphic arts industry. Color can be adjusted with 
buttons, sliders or knobs which control parameters in a color space. The computer also 
provides a mechanism for specifying colors and sets of colors in a manner that captures the 
designer's purpose for choosing a color. It allows the designer to specify interrelationships 
between colors in a design, using the power of the computer to maintain the relationships 
while the designer adjusts different aspects of the design. This section will first explore the 
models and tools available for selecting a single color, then will examine some of the issues 
related to coordinating all the colors in a design. 


Color models 


A tool for selecting a single color consists of a color model and a set of controls defined 
by the model. We can classify color models as device oriented or perceptually oriented, 
calibrated or uncalibrated. 


Device dependent models 


An example of a device dependent model is the cube defined by the red, green, blue 
primaries of a color monitor. A tool based on this model provides the designer with three 
interactive controls that indicate the amount of each primary to add to the color. A color 
patch displays the color currently defined by the controls. Such a tool allows a designer to 
systematically explore the color gamut for a monitor and select the colors desired. Such a tool 
is simple to implement and is available on most computer systems that support color displays. 
While the user interface often presents a continuous set of values for each primary, most 
implementations are quantized to 256 values. Given that the monitor has been correctly 
compensated, the values are assumed to be linearly spaced in brightness. 


The three subtractive primaries, cyan, magenta and yellow, define a cube that is the 
inverse of the RGB system described above. That is, the cyan value equals 1 - the red value, 
magenta equals 1 - green and yellow equals 1 - blue. Most designers familiar with print media 
are familiar with this system, and usually specify the colors as percentages of screened ink in 
each color separation. Since the values are specified considering the mechanics of halftoning, 
they are assumed to be linearly spaced in reflectance. 


If a black separation is available, control must be supplied for it also. The black control 
is interesting in that it is not necessarily independent of the three primaries because equal 
amounts of cyan, magenta and yellow also produce black. Undercolor removal is a technique 
for replacing some fraction of the “black” in dark colors with black ink. If the designer 
specified 10% undercolor removal, for example, 10% of the value of MIN[C,M,Y] should 
automatically be added to the black value and subtracted from the primaries. 
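

The CMY inversion and the undercolor removal rule can be written out directly. The sketch assumes all values are fractions in the range 0 to 1; the function names are hypothetical.

```python
def rgb_to_cmy(r, g, b):
    """Subtractive primaries as the inverse of the RGB cube."""
    return 1.0 - r, 1.0 - g, 1.0 - b


def undercolor_removal(c, m, y, k=0.0, fraction=0.10):
    """Move `fraction` of MIN[C, M, Y] from the primaries into black."""
    removed = fraction * min(c, m, y)
    return c - removed, m - removed, y - removed, k + removed


# A dark color with 10% undercolor removal.
c, m, y = rgb_to_cmy(0.2, 0.4, 0.6)
c2, m2, y2, k2 = undercolor_removal(c, m, y, fraction=0.10)
```

Here MIN[C,M,Y] is 0.4, so 0.04 is added to the black value and subtracted from each of the three primaries.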


Perceptually oriented models 


While the base representations for our hardware are RGB and CMY, using these systems 
for color control requires significant training. There have been several efforts in the field of 
computer graphics to develop models that 1) require less training and 2) can be mapped 
efficiently onto the RGB gamut. Two are common in computer systems today. They are 
called HSV, for hue, saturation, and value, and HSL, for hue, saturation and lightness [11, 6]. 
The names of the primaries in these systems are rather confusing, especially since there are 
alternate definitions of these terms in the color literature. Hue is the least ambiguous, and is 
theoretically represented by a circle with the spectral colors arranged in order around it. 


To map the model efficiently on the RGB color cube, these systems are defined as having 
hexagonal cross sections with red, yellow, green, cyan, blue, and magenta positioned at each 
of the vertices. The HSV model is shaped like a hexagonal cone with white at the center of 
the base and black at the tip. The HSL model is shaped like a double ended hexagonal cone 
with white at one tip and black at the other. In both models, the fully powered primaries and 
secondaries are on the edge of the widest part of the cone and the achromatic axis is the axis 
of the cone. In both models the saturation coordinate defines the distance towards the 
achromatic axis. The value or lightness coordinate defines the distance along the axis of the 
cone. 
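

The geometry of the two cones can be checked with the hexcone conversions in Python's standard `colorsys` module, which we assume here correspond to the HSV and HSL models of [11, 6] (the module calls the latter HLS and takes its arguments in the order hue, lightness, saturation). All coordinates are fractions in the range 0 to 1.

```python
import colorsys

# HSV: single cone, white at the center of the base, black at the tip.
hsv_white = colorsys.hsv_to_rgb(0.0, 0.0, 1.0)   # saturation 0, value 1
hsv_black = colorsys.hsv_to_rgb(0.0, 1.0, 0.0)   # value 0, any saturation
hsv_red = colorsys.hsv_to_rgb(0.0, 1.0, 1.0)     # on the rim of the base

# HSL: double cone, white at one tip, black at the other.
hsl_white = colorsys.hls_to_rgb(0.0, 1.0, 0.0)   # lightness 1
hsl_black = colorsys.hls_to_rgb(0.0, 0.0, 0.0)   # lightness 0
hsl_red = colorsys.hls_to_rgb(0.0, 0.5, 1.0)     # widest part, lightness 0.5

# The six hexagon vertices, 60 degrees apart in hue.
vertices = [colorsys.hsv_to_rgb(k / 6.0, 1.0, 1.0) for k in range(6)]
```

The six `vertices` come out as red, yellow, green, cyan, blue and magenta, and the fully powered primaries sit at value 1 in HSV but at lightness 0.5 in HSL.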


The HSL model is similar to the Munsell [10] system. Lightness corresponds to Munsell 
value, saturation to Munsell chroma. The principal difference is that the colors are not 
perceptually spaced but spaced to map easily to the RGB color cube. The brightest yellow 
and the brightest blue, for example, have the same HSL lightness coordinate although they 
have very different Munsell values. 
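

The yellow and blue example can be verified numerically. The sketch uses the HLS conversion in Python's standard `colorsys` module as a stand-in for the HSL model, and the NTSC luminance weights as a rough proxy for perceived lightness; neither is a Munsell value, so the comparison is only indicative.

```python
import colorsys

YELLOW = (1.0, 1.0, 0.0)
BLUE = (0.0, 0.0, 1.0)


def luminance(r, g, b):
    """Rough brightness estimate using the NTSC luminance weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# rgb_to_hls returns (hue, lightness, saturation).
_, yellow_l, _ = colorsys.rgb_to_hls(*YELLOW)
_, blue_l, _ = colorsys.rgb_to_hls(*BLUE)

print("HSL lightness:", yellow_l, blue_l)
print("luminance:    ", luminance(*YELLOW), luminance(*BLUE))
```

Both colors receive the same lightness coordinate, while the luminance estimate for yellow is several times that of blue.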


The HSV model is easiest to use by thinking of the saturation and value coordinates as 
controlling the amount of white and the amount of black in a color. The “black” control is 
stronger than the white: adding white (decreasing the saturation) will only make the color 
white if the value is 1.0; otherwise it makes the color gray, whereas adding black (decreasing 
the value) always makes the color black. 
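

The asymmetry between the two controls can be seen in a minimal sketch with the standard colorsys module. The choice of red as the starting hue is arbitrary.

```python
import colorsys

hue = 0.0  # red, on colorsys's 0..1 hue scale

# "Adding white" (saturation going to 0) reaches white only when value is 1.0.
white = colorsys.hsv_to_rgb(hue, 0.0, 1.0)
gray = colorsys.hsv_to_rgb(hue, 0.0, 0.5)

# "Adding black" (value going to 0) gives black at any saturation.
black_from_vivid = colorsys.hsv_to_rgb(hue, 1.0, 0.0)
black_from_pale = colorsys.hsv_to_rgb(hue, 0.3, 0.0)

print(white, gray, black_from_vivid, black_from_pale)
```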


Another approach to color selection has been to standardize a set of color names similar 
to the NBS Universal Color Language [8, 2, 3]. Our implementation consists of seven color 
names: red, orange, yellow, green, blue, purple, and brown, with three steps of interpolation 
between them, i.e., green, bluish-green, green-blue, greenish-blue, and blue. There are five 
lightness levels, from light to dark, and four saturation levels, from grayish to vivid. The user 
can specify a name using buttons, or simply type in the color and adjectives. While not 
strictly speaking a color model, the description is included here because the implementation 
currently maps the names directly to the coordinates of the HSL color model. 
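

A mapping of this kind might look like the sketch below. The hue angles, the even spacing of the lightness and saturation levels, and the function names are all assumptions made for illustration; the notes do not give the actual table. Brown is left out because its placement in HSL is not described.

```python
import colorsys

# Hypothetical hue anchors in degrees; the actual values are not given in the notes.
HUE_DEGREES = {"red": 0, "orange": 30, "yellow": 60,
               "green": 120, "blue": 240, "purple": 280}

def interpolated_hue(first, second, step):
    """Hue `step` quarters of the way from `first` to `second` (step 0..4).

    Steps 1..3 stand for names such as bluish-green, green-blue, greenish-blue.
    """
    start, end = HUE_DEGREES[first], HUE_DEGREES[second]
    if end < start:          # e.g. purple -> red wraps past 360 degrees
        end += 360
    return (start + (end - start) * step / 4) % 360

def named_color_to_rgb(first, second, step, lightness_level, saturation_level):
    """Map a color name to RGB by way of HSL coordinates.

    lightness_level: 0 (dark) .. 4 (light); saturation_level: 0 (grayish) .. 3 (vivid).
    The even spacing of the levels is an assumption of this sketch.
    """
    hue = interpolated_hue(first, second, step) / 360.0
    lightness = (lightness_level + 1) / 6    # 1/6 .. 5/6, avoiding pure black and white
    saturation = (saturation_level + 1) / 4  # 0.25 .. 1.0
    return colorsys.hls_to_rgb(hue, lightness, saturation)

# "green-blue", medium lightness, vivid
print(named_color_to_rgb("green", "blue", 2, 2, 3))
```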


These systems have been combined into an interactive tool that allows the user to adjust 
the color of the patch by changing values in any of a number of ways, thereby developing an 
intuition for the relationship between them. All of these models, even CMY, are defined with 
respect to the RGB color gamut of the specific monitor that is running the tool. The systems 
are uncalibrated, so different monitors produce different colors. Specific colors, however, can 
be calibrated with respect to an external standard if the monitor is calibrated. 


Calibrated models. 


A tool for color selection in calibrated systems must provide the user some feedback 
about the gamut limitations for a particular device. There are a number of design situations 
where this is an issue. The most common is using the monitor to select colors for a specific 
printer. Given an adequate model of the color characteristics of the target device, the 
problem reduces to constraining the color selection tools such that it is impossible to specify 
colors outside the gamut of the target device. If the gamut of the target device does not lie 
entirely inside the gamut of the monitor, some sort of feedback must be provided to inform 
the user that the color displayed does not match the color specified. 


The user still has a choice of using a model that is derived from the color space of the 
device, such as CMY for a printer, or using any other model that can be mapped to both the 
monitor and the printer models. Note that to provide “real-time” interaction it is necessary 
to have an efficient conversion algorithm, one that can produce a color value for the monitor in 
less than a quarter of a second. 


We have adequate models for this type of interaction for many of our color printers, all 
our different color monitors, and the NTSC broadcast standard used for videotapes. It is 
important to remember that in this discussion we are considering only the issue of gamut and 
quantization, not other color restrictions such as the restrictions on the rate of change of color 
between adjacent pixels in a videotape or the texturing caused by halftone patterns. 


We have also experimented with the inverse problem, that is, allowing a user to explore a 
model like the space formed by the XYZ tristimulus values, which does not define a gamut, in 
the context of a particular device like a color monitor. The tristimulus values XYZ are 
mapped into x,y,Y to provide a more familiar representation. The familiar chromaticity 
diagram is used as the tool for controlling the values of x and y. The monitor’s gamut is 
defined as a triangle in the x,y plane and only the region inside the triangle is active, 
automatically constraining the values of x and y. Luminance is controlled separately. Color 
selection is restricted to the gamut of the monitor by constraining luminance. The user can 
explore the surface of the gamut by setting Y to track the maximum value for each point x,y. 
The user can explore the interior of the gamut by disabling the tracking feature. In this mode, 
our current implementation adjusts Y only if necessary to stay inside the gamut. An 
interesting extension would be to allow Y to constrain x,y, making it easy to trace regions of 
equal Y. (We can also show a videotape of this.) 
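

The chromaticity part of this tool can be sketched as follows: convert XYZ to x,y,Y and test whether x,y falls inside the monitor's triangle. The primaries used here are the Rec. 709 chromaticities, an assumption standing in for measured monitor values; the function names are illustrative. The luminance constraint is not modeled.

```python
def xyz_to_xyY(X, Y, Z):
    """Convert tristimulus values to chromaticity (x, y) plus luminance Y."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined when X + Y + Z = 0")
    return X / total, Y / total, Y

def _cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def inside_gamut_triangle(x, y, primaries):
    """True if (x, y) lies inside or on the triangle formed by the primaries."""
    r, g, b = primaries
    p = (x, y)
    d1, d2, d3 = _cross(r, g, p), _cross(g, b, p), _cross(b, r, p)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    # The point is inside when it is on the same side of all three edges.
    return not (has_negative and has_positive)

# Assumed primaries (Rec. 709); a real tool would use the monitor's measured values.
MONITOR_PRIMARIES = ((0.64, 0.33), (0.30, 0.60), (0.15, 0.06))

x, y, Y = xyz_to_xyY(0.9505, 1.0, 1.089)  # roughly a D65 white
print(x, y, inside_gamut_triangle(x, y, MONITOR_PRIMARIES))
```

An interactive tool would run the triangle test on every pointer move and ignore positions for which it fails.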


Color design 


A designer sits using the monitor as a sheet of paper or light table. The color tools 
described provide a mechanism for mixing a color and copying it to parts of the design. Since 
color perception is affected dramatically by surrounding colors it is crucial to actually 
integrate the color selection mechanisms with the design system so that colors can be adjusted 
in place. An interesting extension of this is to create a tool that makes it possible to adjust a 
set of colors simultaneously. For any of the color models described it would be 
straightforward to take a set of colors, extract the red, saturation, Y, or whatever components 
and change them with a slider or knob. The question is, what color models do we really want 
to use to adjust a set of colors? 


There is a set of design issues having to do with color control precision. Unless 
constraints are applied, it is easy to use the color tool to generate a set of similar colors that are 
actually intended to be the same. We have found that a level of indirection, a color palette, is 
a useful method of organizing colors for a design or a set of designs. The designer uses the 
color tool to select a set of colors and name them. In the context of a specific design it is 
possible either to adjust the value assigned to the color name, in which case all the areas 
colored by that name change, or to adjust a single region and assign a new name. 
An interesting question arises about how to represent the value of the new color. It can be 
defined as a completely independent color value or it can be defined in terms of the 
modification from the old value. For example, imagine a color named “dark red” and define 
it as 30% of a red primary. Create a new color called “darker red”. Should it be defined as 
25% of the red primary, or as 5% darker than “dark red”? 
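

The two alternatives can be made concrete with a small palette sketch. The class and method names are hypothetical, colors are reduced to a single percentage of the red primary, and "5% darker" is read as five percentage points; none of this is taken from the actual implementation.

```python
class Palette:
    """Named colors; each entry is either absolute or relative to another name."""

    def __init__(self):
        self._absolute = {}  # name -> percent of the red primary (0..100)
        self._relative = {}  # name -> (base name, offset in percentage points)

    def define(self, name, percent):
        """Give `name` an independent value."""
        self._relative.pop(name, None)
        self._absolute[name] = percent

    def define_relative(self, name, base, offset):
        """Define `name` as a modification of the color called `base`."""
        self._absolute.pop(name, None)
        self._relative[name] = (base, offset)

    def resolve(self, name):
        """Return the current value of `name`, following relative definitions."""
        if name in self._absolute:
            return self._absolute[name]
        base, offset = self._relative[name]
        return max(0, min(100, self.resolve(base) + offset))

palette = Palette()
palette.define("dark red", 30)
palette.define_relative("darker red", "dark red", -5)
before = palette.resolve("darker red")

palette.define("dark red", 40)  # the designer adjusts the base color
after = palette.resolve("darker red")
print(before, after)
```

With the relative definition, "darker red" follows any later adjustment of "dark red"; with an absolute definition it would stay at 25%. The sketch does not guard against circular definitions.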


Graphical Style 


The example given in the introduction defined the concept of graphical style to be a 
method for representing the rendering characteristics in such an illustration as a set of names 
and rules for rendering. In the example colors were named by their function, such as 
“HighlightColor-1.” There is a wealth of interesting research issues surrounding this type of 
specification. The first step has been hinted at in the example above of “dark red” and 
“darker red” in the idea that a designer should be able to specify colors by their relationship 
to other colors in the design. It is possible to imagine writing rules involving such perceptual 
terms as contrast, hue, lightness, etc. that would define a set of colors that could be adjusted in 
a manner that could really be called “style”. For example, a design could be rendered in cool 
tones or warm ones, conservative colors or trendy ones. 


Graphical style is crucial for specifying designs that are assumed to be used on a range of 
devices and for a range of purposes. Style rules can be used to encode a designer’s choice for 
constraining a design to the limitations of a particular device. By extending the definition of 
color to include textures, even devices that have no color as conventionally defined can be 
accommodated. Graphical style can also be used to encode quality levels and to define the 
best compromises for different costs. 


Conclusions 


This paper has described what is believed to be a prototype of future design 
environments. The designer works with a computer system as a workbench and must 
consider targeting a design towards multiple devices. Here is a framework for addressing the 
problems of color specification and reproduction in this environment. 


It is a crucial first step to understand and calibrate all the devices in the system. Even if it 
isn’t important to adhere to a standard as precise as the CIE system, it is important to be able 
to model the gamut, transfer functions and quantization effects for each device. This will 
enable making the monitor “look like” a particular device. The designer can then manipulate 
colors for the device in a device dependent manner and also use the monitor to proof the 
results for a particular device. As it typically takes much longer to print an image than display 
it, a quick proofing capability can save a great deal of time. 


The device dependent model is not sufficient for designs that are intended to be used 
across multiple devices. For many purposes, however, a device independent set of color 
names will provide an adequate interchange format. Each device can implement its “best 
effort” towards reproducing the names. As we experiment with this approach, it will be 
interesting to see if it is more important that all the named colors match a standard color, or 
that all the named colors on a particular device maintain a standard relationship. 


The limitation of color names is that for certain designs the named colors do not contain 
the correct colors. Some designers will always want to adjust them. If the design is targeted 
for a particular device, a device dependent set of controls will be adequate. If the design is 
intended to be device independent, it is necessary to develop a device independent model for 
adjusting colors. The standard RGB outlined in the section on standards is one example of 
such a model. Again, it will be interesting to discover what types of parameters are most 
important to someone adjusting the colors in a design. 


The real solution to device independence lies in developing better ways of capturing the 
designer’s purpose for using a color, as outlined in the section on graphical style. 


The Concept of Style 


Richard J. Beach 
Xerox Palo Alto Research Center 


It is important to observe that there are an incredible number of choices in the design 
parameters that go into producing a document. How do people make the choices? 
What controls the choices? How are the choices communicated when they are made? 


1. Style as a Series of Design Choices 


Many design choices are involved in the process of producing a document. For example, 
the copy editor chooses names for the logical parts of the document and communicates them 
to the graphic designer and compositor on the marked-up manuscript. The graphic designer 
chooses the typographical parameters for these marked parts of the manuscript and 
communicates them to the compositor on type-specification sheets, such as the one shown in 
Figure 1. The compositor acts on the mark-up codes, using the type specifications, and enters 
typographical formatting codes in the typesettable file. 


All of these choices influence the publishing style of the organization. The American 
Heritage Dictionary’s definitions of ‘style’ and ‘style book’ help clarify what style means and 
how it can be used: 


“style n. 1. The way something is said or done, as distinguished from its substance . . . 
7. A customary manner of presenting printed material, including usage, punctuation, 
spelling, typography, and arrangement.” [—, Dictionary] 


“style book n. 1. A book giving rules and examples of usage, punctuation, and 
typography, used in the preparation of copy for publication.” [—, Dictionary] 

Each publishing house develops its own house style, a way of doing things that will 
distinguish documents from that publisher. In the publishing experiment described in One 
Book/Five Ways, the University of Toronto Press provided the most concise set of 
composition style guidelines, covering the following topics: 

text composition: word spacing, word division (hyphenation), letterspacing, paragraphs, 
leading, small capitals, figures (numerals). 

punctuation: dashes, periods, apostrophes, colons, semi-colons, exclamations, question 
marks, ellipses, quotations. 

special settings: capitals, tables (avoid vertical rules), footnotes, extracts, quotations. 

page makeup: facing pages, widows. 


Figure 1. TYPOGRAPHIC STYLE SHEET typical of the specifications that graphic 
designers provide compositors to control the parameters of typeset documents. 


People at different levels contribute to a publisher’s distinctive style. The editorial staff 
establishes the guidelines for authors and copy editors, such as recommended forms of 
presentation, spelling, language usage, or the avoidance of vertical rules in tables. Graphic 
designers select the typography and layout for book designs. The composition staff 
determines the final typesetting choices through interpreting the typographic specifications. 


A publisher's style is developed through an iterative process. The high level plan is 
established by the publisher and the editorial staff; they request a certain ‘look’ or ‘feel’ for a 
publication. The graphic designer reduces that high level plan into more specific guidelines, 
but the compositor still has some freedom to interpret typographic choices. The result is 
sample pages. These pages are passed up the chain for approval and are returned for 
correction. The changes iterate among publisher, graphic designer, and compositor until the 
publishing staff finally ‘sees’ what they want. For large documents, this leads to 
inconsistencies in how variations not covered in the sample pages are handled, or even 
differences due to different people working on the manuscript. The solution has traditionally 
been “Try it again until you get it right.” 


For automated composition systems that rely on algorithms to carry out repetitive 
actions, the traditional design process makes it hard to extract the formatting algorithms from 
style guidelines. The guidelines are expressed in terms of what people are doing, rather than 
the process of doing it, or the cause and effect decisions that lead to the result. Therefore, it 
takes several iterations with sample pages that cover all the expected situations before a 
creative programmer can express the style rules as an algorithm. 


2. What Do Styles Affect? 


Style may seem to affect or control more than just the appearance of a document. For 
instance, consider the choice between Canadian and American spelling, something that might 
be treated as a style choice. Clearly different spellings contain different letters, as in ‘colour’ 
versus ‘color’, ‘labelling’ versus ‘labeling’, but the same letters may appear in a different order, 
as in ‘centre’ versus ‘center’. The concept of style must accommodate these apparent changes 
in substance. 


We need to realize that style can accomplish changes at many different levels. The 
change in spelling does not affect the meaning of the sentence containing those words, and 
therefore the substance of the meaning remains constant while the spelling varies. In fact, 
many Canadian and American readers easily pass over these different spellings. The style 
may have changed the characters but not the meaning of the words. 


Consider the language processing trichotomy of lexical, syntactic, and semantic analysis. 
Style can be seen to affect primarily the first two stages of analysis. Style at the lexical level 
affects a token’s appearance, such as the choice of spelling. More common lexical style 
changes are the use of distinctive typefaces for section headings, the inclusion of whitespace 
above and below section headings, etc. In fact, most typographic parameters fall into this 
lexical category of style. 


Style at the syntactic level affects the order of information in the document. One 
example is the order of names in a bibliographic citation: one style places the surname before 
initials, while another style places initials before the surname. Another example of syntactic 
style is the placement rule for parts of a document during page layout, such as locating figures 
at the top or bottom of a page and collecting all footnotes at the bottom of each column. 
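

As a small illustration of syntactic style, the name-ordering rule could be sketched as below. The function and the two style labels are hypothetical; the point is only that the same content (surname and initials) is emitted in a different order under each style.

```python
def format_author(surname, initials, style):
    """Order the parts of an author's name according to a citation style."""
    if style == "surname-first":
        return f"{surname}, {initials}"
    if style == "initials-first":
        return f"{initials} {surname}"
    raise ValueError(f"unknown citation style: {style!r}")

print(format_author("Beach", "R. J.", "surname-first"))
print(format_author("Beach", "R. J.", "initials-first"))
```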


Style is also possible at the semantic level by providing different readers with different 
views of the document. For instance, a document on how to use the Cedar mail system on a 
new kind of file server [van Leunen, One Document] was prepared for readers with different 
backgrounds. The document contained written modules of information for one of three kinds 
of audiences: those who had never used the mail system, those who had used the mail system 
but stored their files locally, and those who had used the mail system and had some 
experience with the new file server. A map of which modules applied to which experience 
categories was used to compile three versions of the document from the various modules. 
Cargill presents similar ideas for managing different views of software source code [Cargill, 
Views]. In his scheme, multiple software versions for differently configurable systems were 
maintained in the same file structure. Depending on the configuration desired, different 
software versions would be extracted. 
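

The module map described above can be sketched in a few lines. The module names, their texts, and the audience labels are invented for illustration; only the idea of selecting modules through a map of audiences comes from the description.

```python
# Hypothetical modules, kept in document order as (name, text) pairs.
MODULES = [
    ("mail-basics", "How to read and send mail."),
    ("moving-local-files", "How to move locally stored mail files to the server."),
    ("server-basics", "How the new file server names and stores files."),
    ("mail-on-server", "Where the mail system keeps its files on the server."),
]

# Map of which modules apply to which experience categories (assumed labels).
AUDIENCES = {
    "mail-basics": {"new-user"},
    "moving-local-files": {"local-files-user"},
    "server-basics": {"new-user", "local-files-user"},
    "mail-on-server": {"new-user", "local-files-user", "server-user"},
}

def compile_version(audience):
    """Join, in document order, the modules marked for `audience`."""
    return "\n\n".join(text for name, text in MODULES
                       if audience in AUDIENCES[name])

for audience in ("new-user", "local-files-user", "server-user"):
    print(f"--- {audience} ---")
    print(compile_version(audience))
```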


3. Styles for Specific Media 


Another style dimension is differentiation in media. Traditional printing processes 
provide some variation in colors and papers, but other reproduction technology and electronic 
documents span a broader range of possibilities. Documents that become projection slides, 
posters, or video displays represent some of these. 


The notion of device independence in computer graphics can be applied to document 
formatting. The survey article on document formatting [Furuta, Survey] presents the notion 
of a ‘view’ of a document as the device-independent post-processing of a formatted document 
for a particular device. However, media and device capabilities may influence the appearance 
and readability of information in a document. In this case, device independence is less 
desirable. Rather, we wish to reformat the document to take advantage of device 
characteristics or, put another way, to change the style to suit the medium in which the 
information will be presented. 


Low-resolution devices without color must obviously use different techniques than 
high-resolution color laser printers. Type families are hard to distinguish on low-resolution 
devices: 8-point Times Roman on a display screen is difficult to distinguish from any other 
serif typeface (such as Garamond or Baskerville) because there are so few ‘bits’ available to 
display subtle differences. A color image may lose a great deal when viewed in black and 
white, especially on low-resolution devices that display only a few, if any, grey levels. 


Graphical Style 
Towards High Quality Illustrations 


Richard Beach and Maureen Stone 
University of Waterloo and Xerox PARC 


ABSTRACT 


If there is to be widespread acceptance of computer generated images in areas 
traditionally served by graphic artists, these images must meet a high standard of 
quality. Document preparation systems are an application area that is gaining maturity 
in providing high-quality computer typeset documents. These systems exhibit a trend 
towards specifying the formatting information for a document separately from the body 
of the text. The goal is to have the document format designed by someone with expert 
knowledge of typography. Writers can then apply a format to their own work simply by 
indicating the semantic content of their text, such as the headings, paragraphs, or 
footnotes. The result is that a writer can produce properly typeset documents without 
learning the esthetics of typography. This paper extends this idea to encompass the 
illustrations in the text. We have developed a prototype system that uses a set of 
graphical style rules to define the design guidelines for the illustrations. The rules, 
called a graphical style sheet, can be used to control a uniform “look” over a set of 
illustrations, or to change the appearance of a particular illustration to reflect different 
publishing styles or different media. The prototype coordinates with an existing 
document preparation system and the combined systems were used to produce this 
paper. We conclude that this is a viable method for controlling image style for at least 
one class of illustrations. This approach contributes to image quality by providing a 
method for capturing knowledge of graphic arts standards, and for ensuring a 
consistent appearance of related illustrations within technical documentation. 


Categories and Subject Descriptors: I.3.3 [Computer Graphics]: Picture/Image 
Generation - Display algorithms; I.3.4 [Computer Graphics]: Graphics Utilities - 
Picture description languages; I.3.6 [Computer Graphics]: Methodology and 
Techniques - Device independence; I.7.2 [Text Processing]: Document 
Preparation - Format and notation; languages; photocomposition; J.6 [Computer 
Applications]: Arts and Humanities - Arts, fine and performing 


Additional key words and phrases: Graphic arts, graphic design, graphical style 
sheet, illustration, integrated text and graphics 


Introduction 


If there is to be widespread acceptance of computer generated images in areas 
traditionally served by graphic artists, these images must meet a high standard of quality. 
Increasingly, we see examples such as chart-making systems or spectacular special effects 
where the quality is defined by the traditional graphic arts standards for print, video or film 
media. This paper describes the development of tools to improve the quality of technical 
illustrations. 


The inclusion of graphic images into computer-typeset documents is an area of current 
research and development, for example, PIC [7], IDEAL [18], JANUS [5], Etude [12], and the 
Xerox Star [10,17]. Typical illustrations which we wish to include are line art and shaded 
images. Frequently the composition systems which create them are text formatters extended 
to handle the higher quality output and flexibility available with typesetters and laser printers. 


Computer graphics has evolved along two fronts towards quality images: the 
introduction of new output devices and the development of new rendering algorithms. New 
devices have higher resolution and more color capability making it possible to render images 
more precisely. New algorithms that generate smooth curves and more realistic shaded 
surfaces provide a way to produce high-quality images. Now, artists and designers can expect 
to find opportunities for creative expression within such systems. 


The style of a document is a phrase that conveys several meanings, all related to quality. 
The word style in a thesaurus refers to the ideas of fashion, method, beauty, class, and 
expression. To a graphic designer, the phrase house style refers to the customary way that a 
particular publishing house handles typesetting or illustrations. 


Traditionally to a graphic designer, a style sheet communicates to the compositor how to 
render a document or image [16]. A style sheet, such as the one in Figure 1, provides 
typographic parameters for typesetting text and guidelines for achieving certain visual effects. 
The purpose of a style sheet is to ensure consistency and design discipline within a project, 
and to provide a rapid and effective means for specifying that discipline. 


Computer typesetting systems have provided style mechanisms for text composition 
through formatting macros or style sheet databases. The ‘-ms’ macro package supplied with 
TROFF is an example of formatting macros that implement document style [11]. SCRIBE 
uses document types to establish the formatting details for a variety of document styles [15]. 
The Xerox Star uses property sheets to select parameters to control the appearance of selected 
items of text and graphics in documents [10]. In each of these systems, some mechanism is 
provided to manipulate the content of a document separately from the appearance of the 
document. Thus consistency and design discipline can be achieved. This is accomplished 
without forcing the author to become a graphic designer, while the graphic designer can 
supply specialized knowledge to create the document style. 


Figure 1. TRADITIONAL STYLE SHEET for specifying typographic parameters. The rows 
indicate the parts of the document to be treated specially. The columns indicate the 
typographic parameters which the compositor uses when typesetting this job. Entries in the 
matrix are either checkmarks or numeric values. 


Extending these formatting style techniques to include graphic images is a natural 
evolution. In the following sections we describe our concept of graphical style for illustrations 
and also the artwork-rendering prototype that we integrated with an existing text composition 
system. The illustrations we consider will be line drawings, although a provision for 
continuous tone images will also be described. 


Figure 2. SCIENTIFIC AMERICAN STYLE for illustrations is evident in this one from 
David Waltz’ article ‘Artificial Intelligence’ [19] in the October 1982 issue, page 122. Several 
stylistic aspects can be noted: the thin line weights, the open arrowhead design, the use of 
color and shading, and the caption typography. (Used with permission of W.H. Freeman & 
Co.) 


Examples of Graphical Style 


To motivate our concept of graphical style, we present two examples taken from 
traditional graphic arts productions. The first example presents observations on some stylistic 
aspects of Scientific American illustrations. The second example describes how a consistent 
style was achieved in producing a book having many line drawings. 


Scientific American Illustration Style 


Scientific American has established a reputation for the clarity and effectiveness of its 
illustrations. In Figure 2, taken from the recent article ‘Artificial Intelligence’ in the October 
1982 issue of Scientific American [19], we can observe several aspects of the Scientific 
American style. Lines are generally thin, although with different weights to convey detail; 
arrowheads are always the same open design; lettering is always 8-point Helvetica capitals; 
shades of grey and colors are used only when needed and then only to convey essential 
meaning; and the design is clean and carefully crafted. While the author supplies the concept 
sketch and the illustrator renders the image with great skill, it is the art department that 
ensures that the traditional style of Scientific American illustrations is maintained [4]. 


Traditional Illustrated Book Production 


A recent textbook for introductory computer science co-authored and typeset by the first 
author of this paper [6] required a large number of illustrations. Many of the illustrations 
were listings of computer programs and their output. With a suitable typeface and the text 
files containing the original programs and computer output, these figures were easily 
composed directly into the main body of the book. In contrast, there were over 150 line 
drawings that were hand-drawn by a draftsman. The production of these illustrations 
presented a considerably different problem. 


Illustration guidelines were written to establish the desired style. The authors and the 
book designer described how various details were to be handled by the draftsman for each 
type of illustration: mathematical graphs, Pascal syntax diagrams, data structure diagrams, 
and simple line drawings. For instance, the guidelines specified the different line weights for 
the axes and curves in graphs, the typography for labels on graphs and syntax diagrams, the 
shading technique for areas in graphs and simple line drawings, and the treatment of arrows 
in syntax and data-structure diagrams. These guidelines were organized by illustration 
categories, and then by illustration components. Thus the guidelines for graphs specified the 
treatment of axes, curves, areas, intersection points, axis tick marks, axis labels, and curve 
functions; and the guidelines for syntax diagrams specified the treatment of terminal symbols, 
nonterminal symbols, and grammar rules. Unfortunately, there was no computer support 
available for drawing the illustrations that could produce sufficiently high quality output or 
that could implement the various guidelines. 


The book’s illustrations were also used to produce overhead transparencies for lectures. 
The computer programs and output could easily be reformatted with a larger type size 
suitable for projection by specifying a different text formatting style. The hand-drawn 
artwork had to be enlarged photographically. When an illustration was scaled to transparency 
size, much of the detail in the illustration was often too small to be read in a large lecture hall. 
A style substitution facility for graphical images, similar to the one available for text, with the 
ability to change the proportions of line weights, to use different shading, and to request 
larger, bolder text captions would have been invaluable. 


[Figure 3 artwork: axis labels a, (a+b)/2, b; in-figure caption “Trapezoidal Rule for n=1 and n=2.”] 


Figure 3. TRAPEZOIDAL RULE FIGURES demonstrate the use of graphical style to 
produce two very different visual effects. The top illustration is adapted from Computing [6] by 
redrawing it with the Griffin illustrator. The style is faithful to the hand-drawn original. The 
bottom illustration uses the same picture file but with a style appropriate for a 35 mm color 
slide. The slide has the preferred format with light detail on a dark background, thicker lines 
in white, a larger, bolder and simpler typeface, and the caption is formatted to the slide width. 
(Used with permission from Reston Publishing Co.) 





Graphical Style Sheets 


A graphical style sheet is a way of describing the design guidelines for an illustration. A 
set of figures produced with the same style sheet should “look” related. It should also be 
possible to dramatically change the appearance of a figure by specifying different styles, as 
shown by the book style and transparency style for the Trapezoidal Rule illustration in Figure 
3. This implies that there is some separation between the style sheet and the specifics of a 
particular illustration. The style sheet should contain rendering information specified in some 
semantic way. For example, there might be a rule that specifies how all axes for graphs 
should be rendered. We need to determine, therefore, how to separate an illustration into 
content versus format, and how to specify the format in terms of rendering attributes. 


Content versus Format in Illustrations 


To apply the content versus format discipline found in text composition systems to 
illustrations, it is necessary to define the content of an illustration separately from its format. 
The illustration’s content is analogous to the author’s rough sketch given to a draftsman, such 
as Figure 4, and its format is analogous to the appearance of the finished artwork rendered by 
the skill and craftsmanship of the artist, for example, Figure 3. The sketch is described by 
geometrical objects and their positioning, while the artwork rendering is described by design 
guidelines and drawing techniques. 


In a manner analogous to the style mechanisms of text formatters, there must be an 
additional means of including semantic notions in an illustration. Just as not all three-space 
indentations are paragraph indents, not all thin lines are axes on a graph. Therefore, 
graphical style must include mechanisms for capturing the intent of the author/illustrator. 


[Figure 4 sketch: the curve f(x) drawn with thin lines; the x-axis is labeled a, (a+b)/2, and b;
the sketch carries the caption "Trapezoidal Rule for n=1 and n=2."]


Figure 4. SKETCH OF THE ILLUSTRATION for Figure 3 represents the basic geometry of
the picture. All the rendering information for the figure has been reduced to drawing thin lines
and using a typewriter-like typeface. The three examples of the Trapezoidal Rule have all been
produced by using the same TiogaArtwork file but with appropriate differences in the style
rules.


One way to do this is to provide a level of indirection which names the semantic parts of the
illustration. The rendering attributes and guidelines associated with these names are defined 
separately in a graphical style sheet. If this level of indirection is available, then it is possible 
to render the same illustration in quite different ways by changing only the graphical style 
definitions. For example, Figure 3 shows two graphical styles with thin, clean lines for a 
typeset book and with wider, bolder lines and colors for a color transparency. 
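The level of indirection described above can be sketched in a few lines of Python. This is only an illustrative model, not the TiogaArtwork implementation: the dictionary layout and the `render` function are assumptions, while the rule names and line weights are taken from the style sheets in Figure 8.

```python
# Hypothetical model: geometry tagged with semantic names, with the
# rendering attributes looked up in a separately defined style sheet.
ILLUSTRATION = [
    {"name": "axis", "path": [(1, 1), (1, 185)]},
    {"name": "curve", "path": [(1, 1), (25, 49), (105, 113)]},
]

BOOK_STYLE = {
    "axis": {"lineWeight": 1, "outlineColor": "black"},
    "curve": {"lineWeight": 2, "outlineColor": "black"},
}

SLIDE_STYLE = {
    "axis": {"lineWeight": 2, "outlineColor": "white"},
    "curve": {"lineWeight": 4, "outlineColor": "white"},
}


def render(illustration, style_sheet):
    """Pair each named part with the attributes its style rule supplies."""
    return [
        {"path": part["path"], **style_sheet[part["name"]]}
        for part in illustration
    ]


book = render(ILLUSTRATION, BOOK_STYLE)
slide = render(ILLUSTRATION, SLIDE_STYLE)
```

The geometry is identical in `book` and `slide`; only the attributes reached through the names differ.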


Rendering Attributes 


A graphical style sheet must express how an illustration is to be rendered. Basic drawing 
attributes supported in most graphics packages are obvious candidates for specifying how an 
artwork rendering program should produce an illustration. Examples of such attributes 
appear in the Griffin illustrator [3], the GKS standard workstation attribute model [2] and in 
the Xerox Star basic graphics feature [10]. These examples suggest that at least line weight, 
line patterns, color specification, and caption typography parameters be included in any 
graphical style sheet. 


Graphic designers frequently rely on mechanical aids and transfer sheets to obtain 
consistent or special effects, suggesting other sources of rendering attributes. A standard 
reference for transfer sheet designs is the Letraset Catalog [1]. Additional rendering 
algorithms can be created to produce some of those effects. For instance, a line in an 
illustration sketch might be rendered by specifying an arrow design in the graphical style 
sheet to be drawn along that line. Similarly, borders or texture patterns might be rendered 
from details provided in graphical style sheets. 


The Prototype System 
Overview 


To experiment with these ideas of graphical style, we developed a prototype suitable for a
class of technical illustrations. The system, named TiogaArtwork, was designed to coordinate
with the Tioga document preparation system and an interactive illustration program called 
Griffin [3]. The procedure for generating a figure is to make a draft using Griffin and then 
convert the figure to a special kind of Tioga document. Once the figure is in document form 
it is possible to adjust both the style and the content using a combination of Tioga and 
TiogaArtwork. All of the figures in this paper, except the published example in Figure 2, 
have been produced using this technique.


The TiogaArtwork system was developed in the Cedar programming environment [14],
which is a research project at Xerox PARC. Cedar is both a language, derived from Mesa
[13], and a computing environment. The hardware for this environment is a Dorado
processor [9] with a 1024 by 808 pixel bi-level display and, optionally, a 640 by 480 pixel color 
display. A range of medium- to high-resolution printers is available. The highest-quality 
printers produce digitally half-toned images at phototypesetter-compatible resolutions either 
in black and white or as color separations. 


The design of the prototype is based on features of the Tioga document preparation 
system, so it is necessary to briefly describe Tioga before going into the details of 
TiogaArtwork. The current implementation of Tioga consists of an interactive text editor and 
a batch-oriented typesetter. Documents in this system consist of two files: a file of text nodes 
and a file of formatting style rules. The text nodes in Tioga have a hierarchical structure
similar to that of the NLS editor [12]. A document is a tree-structured hierarchy of nodes 
containing text, for example, in section, subsection, paragraph order. A normal tree-traversal 
results in the familiar form of the document. Each node in the document can possess certain 
node properties. The principal use of these properties is to associate the formatting style rules 
with the nodes. Figure 5 represents the Tioga document structure of some surrounding nodes 
of the document for this paper. 


The style concept in the Tioga formatter is similar to that in SCRIBE [15]. In Tioga, 
formatting styles are defined by a collection of rules contained in a separate file. The style 
rules specify formatting parameters such as type family, style, and size, indentation, interline 
leading, text justification mode, and composition layout parameters. The formatting rules are 
written in an interpreted language, so it is possible to compute parameters during the 
formatting process. 


We have designed a way to include nodes containing graphics in a Tioga document. 
These nodes contain a textual representation of geometric attributes such as line coordinates,
curve control points, positions, and transformations. The style rules associated with these
nodes specify graphical parameters and rendering attributes. Tioga’s node properties are
implemented as named property lists, so we defined a new property, which we called
ArtworkClass, for the illustration nodes. The style machinery for Tioga is extensible, so
the current set of tools allows us to build on the existing mechanisms for manipulating styles 
and for adding graphical style attributes. It also means that our system can use Tioga facilities 
for text formatting. 


A document in our system, therefore, is a tree of nodes arranged in a hierarchical
structure. Some of the branches of the tree represent paragraphs of text and some represent
illustrations. An illustration is a subtree in the document tree-structure with its root node
possessing an ArtworkClass property. Nodes within the subtree structure may be either
graphics or text. Nodes representing subpictures will have the ArtworkClass property,
while text captions within an illustration will have only the standard Tioga properties. This
recursive relationship is important; it means that our system can use all the text formatting
features of Tioga for text inside of illustrations. Figure 6 represents the Tioga document node
structure for portions of the Trapezoidal Rule illustrations in Figures 3 and 4.


[Figure 5 diagram: a tree of boxed text nodes rooted at "Document". Each box is labeled with
its StyleRule property: head1 ("The Prototype System", "Conclusions"), head2 ("Overview",
"Geometric Description"), paragraph ("To experiment with these ideas...", "The system was
developed...", "A Tioga document is a tree..."), and item ("x y .translate -- to translate...").]


Figure 5. DOCUMENT STRUCTURE provides a hierarchy for organizing a text document.
This illustration represents the Tioga document structure for the nearby sections and
subsections of this paper. The boxes represent text nodes and the labels above each box
represent the node properties. The StyleRule property indicates the formatting attributes for
first-level headings, second-level headings, paragraphs, and items within a list.


[Figure 6 diagram: a tree of boxed nodes rooted at "Trapezoidal Rule Figure". The root and the
positioning nodes (position axis, position curve, position area, position label) are labeled
ArtworkClass = ArtworkNode. The drawing nodes (draw y-axis, draw curve, draw area) are labeled
ArtworkPath together with the axis, curve, and area1 StyleRule properties. The caption text
nodes, such as "f(x)", are labeled with the rightCaption, centeredCaption, and leftCaption
StyleRule properties.]


Figure 6. ILLUSTRATIONS in TiogaArtwork also use the Tioga document structure. The
boxes represent the artwork nodes which contain the geometrical representation, and the
labels above the boxes represent the node properties. Note that StyleRule properties exist on
both text and artwork nodes. This structure can be extended to encompass pictures composed
of many subpictures and text captions in a natural hierarchy.


TiogaArtwork is the part of the system that interprets the graphical nodes and style rules. 
The pictures can be previewed on a display screen using the Cedar graphics package [21] or 
converted for printing.


The combination of using the Tioga document structure and representing the graphics as
text allows TiogaArtwork to interact easily with the Tioga editor and the Tioga typesetter. 
This is advantageous in a number of ways. One advantage is the ease of creating and editing 
figures. The Tioga editor does not react to the ArtworkClass property, so the text and 
style properties for a graphics node can be edited in the normal manner. This means that for 
the prototype experiment we did not have to write a special editor to manipulate the graphics, 
although we did write a conversion routine which translated from Griffin format to Tioga 
node structure and automatically generated style rule properties from the Griffin style 
attributes. 


Another advantage of using the Tioga document structure for illustrations is that it 
permits us to typeset any text in the illustration. This is accomplished by setting up a call-back 
mechanism between the typesetter program and TiogaArtwork. As the document tree is 
traversed, the typesetter formats text nodes in the normal fashion. Whenever a node with the 
ArtworkClass property is encountered, that entire subtree is passed to TiogaArtwork. If 
TiogaArtwork subsequently encounters a text node while rendering the illustration, it passes 
the text branch back to the typesetter. This recursion is guaranteed to terminate at the end of 
the tree-path traversal. 
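The call-back arrangement between the typesetter and TiogaArtwork can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the `Node` class and the `typeset` and `render_artwork` functions are hypothetical stand-ins, and the "formatting" is reduced to appending an entry to a log.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    text: str
    artwork_class: Optional[str] = None  # e.g. "ArtworkNode"; None for plain text
    children: List["Node"] = field(default_factory=list)


def typeset(node, log):
    """Stand-in for the typesetter: formats text, hands artwork subtrees over."""
    if node.artwork_class is not None:
        render_artwork(node, log)  # the entire subtree goes to the artwork renderer
        return
    log.append(("typeset", node.text))
    for child in node.children:
        typeset(child, log)


def render_artwork(node, log):
    """Stand-in for TiogaArtwork: renders graphics, calls back for text."""
    log.append(("artwork", node.text))
    for child in node.children:
        if child.artwork_class is None:
            typeset(child, log)  # text branch goes back to the typesetter
        else:
            render_artwork(child, log)


figure = Node("figure", "ArtworkNode",
              [Node("draw curve", "ArtworkPath"), Node("f(x)")])
document = Node("Document", children=[Node("A paragraph"), figure])
trace = []
typeset(document, trace)
```

Each call descends one level of a finite tree, which is why the mutual recursion ends when the traversal reaches the leaves.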


The call-back mechanism is also used to pass dimensional information between the two 
programs. The typesetter provides the dimensional parameters for the formatted text with 
which TiogaArtwork can lay out the caption within the illustration according to the style rule.
TiogaArtwork generates the dimensions of the formatted graphics so the typesetter can lay out
the figures with the paragraphs on the page. 


This document structure and system design give us a great deal of flexibility for
experimenting with the best way to represent illustrations in a document. The current 
structure puts only geometric information in the document body and puts all rendering 
parameters in the style rules. We found that this representation gives a good semantic 
description of the picture. The next sections will describe our current implementation in 
more detail. 


Geometric description 


A Tioga document is a tree of nodes. Some of the branches of the tree represent 
paragraphs of text, and some represent illustrations. For text, the levels of the tree represent 
the section-subsection-paragraph hierarchy of the document. For illustrations, the hierarchy 
represents a relative set of transformations for subpictures. Each subpicture in the illustration 
is drawn relative to the coordinate system established by its ancestors. An ArtworkClass
node, therefore, may contain a set of coordinate transformations which establish the
coordinate system for this node and all of its descendant subpictures:


x y .translate -- to translate the origin to <x,y>

sx sy .scale -- to scale by x,y scaling factors

r .rotate -- to rotate by r degrees


A geometrical shape can be composed of straight lines and curves, and a sequence of 
these lines and curves is called a path [21]. A path can represent a line, an area or a clipping 
region. The path definition is provided by commands to draw lines and curves: 


x y .moveto -- establish the current path position <cx,cy> as <x,y>


x y .lineto -- draw a line from the current position to <x,y>, and reset <cx,cy> to
<x,y>

x1 y1 x2 y2 x3 y3 .curveto -- extend the path with a curve which has the four
Bezier control points <cx,cy>, <x1,y1>, <x2,y2>, and <x3,y3>, and reset <cx,cy>
to <x3,y3>


For the kind of pictures we are working with, the paths and transformations are all that is 
specified in the nodes of an illustration document. The rest of the rendering information will 
be supplied by the style rules. Figure 7 is an extract from the geometric description used for 
the three instances of the trapezoid rule diagrams in Figures 3 and 4. 
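To make the postfix notation concrete, here is a minimal Python sketch of an interpreter for the six commands listed above. It is not the TiogaArtwork code. The function name `interpret`, the six-number affine representation, and the convention that each new transformation is applied before those of its ancestors (as in PostScript) are all assumptions made for illustration.

```python
import math

IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def concat(m, n):
    """Compose two affine maps so that n is applied first, then m."""
    a, b, c, d, tx, ty = m
    e, f, g, h, ux, uy = n
    return (a * e + c * f, b * e + d * f, a * g + c * h, b * g + d * h,
            a * ux + c * uy + tx, b * ux + d * uy + ty)


def apply(m, x, y):
    """Map the point <x,y> through m = (a, b, c, d, tx, ty)."""
    a, b, c, d, tx, ty = m
    return (a * x + c * y + tx, b * x + d * y + ty)


def interpret(source, ctm=IDENTITY):
    """Run a postfix command string; return (ctm, path) in ancestor coordinates."""
    stack, path = [], []
    for line in source.splitlines():
        for token in line.split("%", 1)[0].split():  # "%" starts a comment
            if not token.startswith("."):
                stack.append(float(token))
            elif token == ".translate":
                y, x = stack.pop(), stack.pop()
                ctm = concat(ctm, (1.0, 0.0, 0.0, 1.0, x, y))
            elif token == ".scale":
                sy, sx = stack.pop(), stack.pop()
                ctm = concat(ctm, (sx, 0.0, 0.0, sy, 0.0, 0.0))
            elif token == ".rotate":
                r = math.radians(stack.pop())
                cos_r, sin_r = math.cos(r), math.sin(r)
                ctm = concat(ctm, (cos_r, sin_r, -sin_r, cos_r, 0.0, 0.0))
            elif token in (".moveto", ".lineto"):
                y, x = stack.pop(), stack.pop()
                path.append((token[1:], [apply(ctm, x, y)]))
            elif token == ".curveto":
                nums = [stack.pop() for _ in range(6)][::-1]
                points = [apply(ctm, nums[i], nums[i + 1]) for i in (0, 2, 4)]
                path.append(("curveto", points))
            else:
                raise ValueError(f"unknown command {token}")
    return ctm, path
```

Passing the `ctm` returned for a parent node into the call for a child node models the rule that each subpicture is drawn relative to the coordinate system established by its ancestors.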


For reasons which become apparent during the discussion of rendering algorithms, it is 
necessary to distinguish between transformations and path definitions in the contents of an 
ArtworkClass node. It is also convenient to create artwork nodes which specify the file 
name of a TiogaArtwork illustration. Continuous-tone images stored as files are another 
category of illustration that can be accommodated by the ArtworkClass property. The 
values of the ArtworkClass property are the following: 


ArtworkNode -- node contains the textual representation of an illustration which is not 
a path, such as transformations 


ArtworkPath -- node contains the geometric definition of a path 


ArtworkImage -- node contains the name of a continuous-tone image file stored as an
array of intensity samples


ArtworkFileName -- node contains the name of another TiogaArtwork file


% TiogaArtwork figure for Trapezoid Rule
% Cluster 1
0 0 .translate 1 1 .scale 0 .rotate
    % y-axis
    11 31 .translate 1 1 .scale 0 .rotate
        1 1 .moveto 1 185 .lineto
    % x-axis
    3 39 .translate 1 1 .scale 0 .rotate
        1 1 .moveto 249 1 .lineto
    % curve
    27 87 .translate 1 1 .scale 0 .rotate
        1 1 .moveto
        8 17 15 33 25 49 .curveto
        43 78 71 106 105 113 .curveto
        131 118 161 110 185 97 .curveto
        194 92 201 87 209 81 .curveto
    % area from a to (a+b)/2
    51 39 .translate 1 1 .scale 0 .rotate
        1 1 .moveto 1 97 .lineto
        81 161 .lineto 81 1 .lineto
        1 1 .lineto
    % area from (a+b)/2 to

    % y-axis label
    8 216 .translate 1 1 .scale 0 .rotate
        y
    % x-axis label
    247 36 .translate 1 1 .scale 0 .rotate
        x


Figure 7. GEOMETRIC REPRESENTATION of an illustration in a textual form consists of
comments, transformations, and path definitions. The Trapezoidal Rule illustration was first
drawn with the Griffin illustrator [3]. It was then automatically converted into a TiogaArtwork
document generating the node structure and node properties from the Griffin illustration file.
The indentation indicates the node structure. The comments, which begin with percent signs,
were added manually by editing the document text.


Basic style parameters


Style rules are described in an interpreted language with each format rule expressed as a
procedure definition. A typical rule describes a list of parameter-value pairs which will be
stored in a global association list. The general format for this in the Tioga editor is:


(name-of-style-rule) "commentary describing the style" {
    value parameterName
    value parameterName
    ...
    value parameterName
} StyleRule


The name-of-style-rule refers to some semantic aspect of the illustration such as 
axis, curve, or nonterminal. Values are either numbers or keywords and may be expressions. 
Values which are distances may be expressed in the most convenient units. For instance, 2-point
line weight can be expressed as 2 pt, colors can be expressed by keyword color names such 
as red, darkBrown, or lightBlue, and relative colors can be evaluated as some 
percentage (such as 75 percent) of the brightness or saturation of a named color. 
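The general rule format lends itself to a very small parser. The Python sketch below is an illustration only: it reads each body line as "value parameterName" rather than executing the rule as a procedure in an interpreted language, which is what Tioga actually does. The names `parse_style` and `parse_value` are hypothetical, and the unit table (points and picas, with 12 points to the pica) is an assumption about which units a reader might want.

```python
import re

RULE = re.compile(
    r'\((?P<name>[^)]+)\)\s*"(?P<doc>[^"]*)"\s*\{(?P<body>[^}]*)\}\s*StyleRule'
)

# Assumed unit table for this sketch; distances are normalized to points.
UNITS_IN_POINTS = {"pt": 1.0, "pc": 12.0}


def parse_value(tokens):
    """Turn the value tokens of one line into a number or a keyword string."""
    if len(tokens) == 2 and tokens[1] in UNITS_IN_POINTS:
        return float(tokens[0]) * UNITS_IN_POINTS[tokens[1]]
    if len(tokens) == 1:
        try:
            return float(tokens[0])
        except ValueError:
            return tokens[0].strip('"')
    return " ".join(tokens)


def parse_style(source):
    """Return {rule name: {parameterName: value}} for every rule in source."""
    rules = {}
    for match in RULE.finditer(source):
        params = {}
        for line in match["body"].splitlines():
            tokens = line.split()
            if tokens:
                *value, name = tokens  # the parameter name comes last
                params[name] = parse_value(value)
        rules[match["name"]] = params
    return rules


AXIS_RULE = '''
(axis) "x,y axes" {
    black outlineColor
    outlined pathType
    1 pt lineWeight
} StyleRule
'''
axis_style = parse_style(AXIS_RULE)["axis"]
```

Because the parameter name comes last on each line, the value part can hold a bare keyword, a number, or a number followed by a unit.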


The following basic set of drawing attributes was defined as graphical style parameters:

lineWeight      the line thickness
pathType        the path area/outline type: filled, outlined, filled+outlined
penType         the pen shape: round, square, rectangular, elliptical, italic
penHeight       the pen height as a proportion of lineWeight
penWidth        the pen width as a proportion of lineWeight
penAngle        the rotation of the pen, in degrees from horizontal
areaColor       the color of filled areas: hue, saturation, brightness
outlineColor    the color of outlines: hue, saturation, brightness


The following set of style attributes is supplied by the existing formatter but is
interpreted by the artwork-rendering software for illustration captions and labels:

family          the type family, such as Helvetica or TimesRoman
size            the type size
face            the type style: regular, italic, bold, and bold+italic
captionFormat   the text justification mode: flushLeft, flushRight, centered, or justified
captionAlign    the vertical text justification mode: flushTop, centered, baseline, or flushBottom
lineLength      the length of caption lines
leftIndent      the left indent for captions
rightIndent     the right indent for captions
leading         the spacing between lines
textRotation    the rotation of the text line, in degrees from horizontal
textColor       the color of caption text: hue, saturation, brightness


To demonstrate both the style language and the graphical style attributes, Figure 8 shows
the two style definitions necessary for Figure 3.





% TrapezoidBook.Style
BeginStyle
(BasicGraphics) AttachStyle
(BasicText) AttachStyle

(axis) "x,y axes" {
    black outlineColor
    outlined pathType
    1 pt lineWeight
} StyleRule

(area1) "dark areas" {
    grey areaColor
    filled pathType
} StyleRule

(area2) "light areas" {
    lightGrey areaColor
    filled pathType
} StyleRule

(curve) "function line" {
    black outlineColor
    outlined pathType
    2 pt lineWeight
} StyleRule

(leftCaption) "..." {
    "TimesRoman" family
    8 bp size
    italic face
    flushLeft captionFormat
    flushTop captionAlign
    0 leftIndent
} StyleRule

(centeredCaption) "..." {
(rightCaption) "..." {

EndStyle


% TrapezoidSlide.Style
BeginStyle
(BasicGraphics) AttachStyle
(BasicText) AttachStyle

(axis) "x,y axes" {
    white outlineColor
    outlined pathType
    2 pt lineWeight
} StyleRule

(area1) "dark areas" {
    orange areaColor
    filled pathType
} StyleRule

(area2) "light areas" {
    lightYellow areaColor
    filled pathType
} StyleRule

(curve) "function line" {
    white outlineColor
    outlined pathType
    4 pt lineWeight
} StyleRule

(leftCaption) "..." {
    "Helvetica" family
    12 bp size
    bold face
    flushLeft captionFormat
    flushTop captionAlign
    0 leftIndent
    white textColor
} StyleRule

(centeredCaption) "..." {
(rightCaption) "..." {

EndStyle


Figure 8. GRAPHICAL STYLE SHEETS for the two Trapezoidal Rule illustrations in
Figure 3 demonstrate the style language and the graphical style attributes. The first style
(TrapezoidBook.Style) produces a typeset book quality illustration and the second style
(TrapezoidSlide.Style) produces a colored 35 mm. slide form. Note that the styles differ in
the line weights, color selections, and typography parameters.


Rendering the illustration


TiogaArtwork translates our representation of an illustration into a set of calls on the 
Cedar graphics package [21]. The graphics package implements a full set of transformation 
and clipping operations. Shapes are described as a set of analytical outlines that are filled 
either with a flat color or an image. 


The geometry in our documents consists of paths and transformations. The 
transformations translate one-for-one into calls on the graphics package. The path definitions 
are compatible with the outline description required by the graphics package. Thus, if a node 
contains a path with pathType = filled, then the path represents a closed area to be
colored with areaColor and can be easily rendered.


If the pathType is outlined, then it represents the center-line of a line or pen stroke.
A thick line is drawn along this path as defined by the lineWeight and pen parameters.
Predefined pen shapes are round, square, rectangular, elliptical, and italic. Other style 
parameters control the aspect ratio and rotation of the pen shapes. The graphics package does 
not currently support a pen semantic, so TiogaArtwork reduces this description to an outline. 
This definition of a thick line is similar to that of a stroke in Metafont [8]. 
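Reducing a stroked center-line to an outline is easiest to see for a single straight segment. The Python sketch below covers only that case and assumes flat ends perpendicular to the line; the pen shapes, joints, and curves handled by TiogaArtwork are outside its scope. The function name `stroke_outline` is hypothetical.

```python
import math


def stroke_outline(p0, p1, line_weight):
    """Closed four-corner outline of a straight center-line stroked at line_weight."""
    (x0, y0), (x1, y1) = p0, p1
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("degenerate segment")
    # Unit normal to the center-line, scaled to half the line weight.
    nx = -(y1 - y0) / length * line_weight / 2
    ny = (x1 - x0) / length * line_weight / 2
    return [
        (x0 + nx, y0 + ny),
        (x1 + nx, y1 + ny),
        (x1 - nx, y1 - ny),
        (x0 - nx, y0 - ny),
    ]


corners = stroke_outline((0, 0), (10, 0), 2)
```

The resulting polygon can be filled like any other closed path, which is what makes the reduction convenient for a graphics package with no pen semantic.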


Extended style semantics 


This basic graphical style machinery can be extended to express more complicated style 
semantics. The following examples of shadows, arrows, and borders are based on common
graphic arts practice. 


Shadow Styles 


Two-dimensional shadow effects can be created to emphasize an object. Two simple 
examples of shadow effects are drop shadows and offset shadows, shown in Figure 9. A drop 
shadow appears to give an object depth by extending a slanted shadow line away from the 
object. An offset shadow gives emphasis by showing an underlying copy of the object a short 
distance away. The style parameters for shadow effects are the following:

shadowType            the shadow effect: drop or offset
shadowPathType        the offset shadow path type: filled, outlined, filled+outlined
shadowAngle           the angle of the shadow from the object, in degrees
shadowOffsetAmount    the distance that the offset shadow is placed at shadowAngle
shadowDirection       the direction of the shadow from the object: upLeft, upRight,
                      downLeft, downRight
shadowWeight          the weight of the drop shadow or outline of the offset shadow
shadowAreaColor       the color of the drop shadow or offset shadow area
shadowOutlineColor    the color of the offset shadow outline


Figure 9. SHADOW EFFECTS can be rendered by a simple algorithm that reuses the
geometric path definition several times. The arrow on the left has a drop shadow and the
arrow on the right has an offset shadow. Several style parameters are available to control the
weight, angle, color, and type of shadow.


Both shadowing techniques can be rendered by first drawing the shadow of the object 
and then drawing the intended object. Slightly darker shadow colors can be computed in a
style rule; for instance, shadowAreaColor might be computed as 50 percent of the
brightness of the areaColor.
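The offset-shadow case can be sketched briefly in Python. The function below is an illustration rather than the TiogaArtwork algorithm: the name `offset_shadow`, the `brightness` argument, the representation of colors as (hue, saturation, brightness) triples in the range 0 to 1, and the choice to measure the angle counter-clockwise from the positive x-axis are all assumptions.

```python
import math


def offset_shadow(path, area_color, shadow_angle, shadow_offset, brightness=0.5):
    """Return fill operations in drawing order: the shadow copy, then the object."""
    dx = shadow_offset * math.cos(math.radians(shadow_angle))
    dy = shadow_offset * math.sin(math.radians(shadow_angle))
    hue, saturation, value = area_color
    # The shadow color is the object's color at reduced brightness.
    shadow_color = (hue, saturation, value * brightness)
    shadow_path = [(x + dx, y + dy) for x, y in path]
    return [("fill", shadow_path, shadow_color), ("fill", path, area_color)]


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
operations = offset_shadow(square, (0.1, 0.8, 0.6), shadow_angle=0, shadow_offset=3)
```

Drawing the shadow first means the object simply paints over the part of the shadow it covers, so no clipping is needed.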


Arrow Styles 


Arrows in drawings can be described as paths having a particular arrow style. The style 
must describe the shape of the arrowhead and tail. One approach is to define a prototype 
shape on a simple rectangular grid. This model resembles a transfer sheet system such as the 
one provided by Letraset [1]. To avoid cataloging a large variety of curved arrows, a mapping 
is used to stretch the prototype shape along any given path. The x-axis of the grid is mapped 
onto the path directly, and y-values are mapped into distances normal to the path [20]. This 
scheme permits considerable scope for creative designs by using a simple and easy-to-draw 
prototype, and then computing the hard-to-draw finished design as in Figure 10. 
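The mapping from the prototype grid onto a path can be sketched as follows. This Python fragment is a simplified illustration: it assumes the path has already been flattened to a polyline, takes the prototype x-coordinate as a fraction `u` of the path's arc length, and offsets by `v` along the left-hand normal. It stretches the whole prototype uniformly and does not preserve the head and feather shapes. The name `map_along_path` is hypothetical.

```python
import math


def map_along_path(prototype, path):
    """Map prototype points (u, v) onto a polyline.

    u in [0, 1] is the fraction of arc length along the path;
    v is a signed distance along the left-hand normal.
    """
    segments, total = [], 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        if length > 0:
            segments.append((total, length, (x0, y0), (x1, y1)))
            total += length
    if not segments:
        raise ValueError("path has no length")

    mapped = []
    for u, v in prototype:
        s = min(max(u, 0.0), 1.0) * total
        # Find the segment that contains arc length s.
        for start, length, (x0, y0), (x1, y1) in segments:
            if s <= start + length:
                break
        t = (s - start) / length
        tx, ty = (x1 - x0) / length, (y1 - y0) / length  # unit tangent
        mapped.append((x0 + t * (x1 - x0) - v * ty,
                       y0 + t * (y1 - y0) + v * tx))
    return mapped


corner_path = [(0, 0), (10, 0), (10, 10)]
mapped_points = map_along_path([(0, 0), (0.5, 1), (0.75, 2), (1.0, 0)], corner_path)
```

A prototype drawn on a straight grid therefore bends with the path, since every point is placed relative to the local tangent and normal.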


Border Patterns 


Arbitrary patterns are a logical extension of the style for arrow heads. These border 
patterns are slightly more complicated to render because the prototype pattern must be 
repeated along the path like a wallpaper design. As only an integral number of repetitions is 
desired, the mapping algorithm must adjust scale factors to ensure an exact number of cycles. 
Again, the same simple prototype grid is mapped along the path, as shown in Figure 11.


Figure 10. ARROW STYLES can be created by mapping a prototype design, developed on a
rectangular grid, along a curved path. The mapping algorithm can be controlled to preserve
the arrow head and feather shapes and only stretch the shaft of the arrow. The path can be any
combination of lines or curves. The style of the mapped arrow can be filled or outlined in a
similar way to other TiogaArtwork objects.
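The scale adjustment described under Border Patterns amounts to a short calculation. The Python sketch below assumes that the repetition count is chosen by rounding to the nearest whole number, with a minimum of one; the text only states that the count must be integral, so the rounding rule and the name `repetition_scale` are assumptions.

```python
def repetition_scale(path_length, tile_length):
    """Choose an integral tile count and the scale that makes the tiles fit exactly."""
    if path_length <= 0 or tile_length <= 0:
        raise ValueError("lengths must be positive")
    count = max(1, round(path_length / tile_length))
    scale = path_length / (count * tile_length)
    return count, scale


count, scale = repetition_scale(100.0, 30.0)
```

Here a 30-unit pattern on a 100-unit path is repeated three times, each copy stretched slightly so that the three copies cover the path exactly.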


Conclusions 


We conclude that a representation that explicitly specifies the stylistic properties of an 
illustration provides a powerful way to control document quality. Graphical style sheets can 
provide design discipline. Using a common style when designing a set of illustrations for a 
paper, for example, can guarantee a uniform specification of such parameters as color, line 
styles, and arrowhead shapes. Another benefit is that the additional semantic structure 
introduced by the style sheet makes it possible to produce those figures over a range of 
circumstances. Different style sheets can be designed to accommodate not only different
publishing styles but also the restrictions inherent in different media, such as the difference
between paper and 35 mm slide media. If it is easier to reuse figures, then perhaps more time
will be spent producing good ones.


Figure 11. BORDER PATTERNS can also be mapped along paths using the same technique
as for the arrows, but using an integral number of repetitions. This Greek Key design consists
of a white rectangular area and a blue spiral. The design was created to fit together end to
end. The path is a cubic curve.


We have successfully implemented a prototype system for manipulating graphical styles 
which was built to coordinate with the Tioga document preparation system. We have used 
the combination of Tioga and TiogaArtwork to produce this paper. We conclude from 
working with this system that it is possible to control some of the stylistic aspects of 
illustrations using a set of style rules, and that this approach is a useful one for specifying 
pictures in documents. The particular representation we used was a convenient one for our 
environment. The important aspect is the separation of the geometry and the style properties, 
especially the level of indirection introduced by using named style rules. The flexibility 
inherent in the textual representation and the style language was important in an 
experimental system. We suspect that a style language is the correct level of abstraction for 
graphical styles even in fully developed systems. 


Representing illustrations as styles plus geometry introduces some interesting issues with 
respect to rendering algorithms. For example, one natural specification for line style is to 
specify a uniform width. The specification should also describe the behavior of the line at 
joints and endpoints; for example, whether the corners are mitered or rounded off. An 
algorithm that produces an analytical description for the shape of an outline specified in this 
manner is nontrivial, especially if the path contains parametric cubic splines. This kind of 
example forces us as graphics system designers to pay attention to useful rendering 
algorithms, not just convenient ones. 


Future Work 


It is clear that for many applications one should use an interactive graphics system to 
design an illustration. For example, we used the Griffin system to generate the illustrations 
for our paper rather than manually calculate path descriptions. There are many interesting 
questions about how graphical style should be specified in such a system. Griffin assigns 
attributes to shapes, but there is little semantic content to the assignment; all objects with the 
same set of properties have the same style. 


In general, we need to learn more about organizing style rules. Our current 
implementation has a different rule for each node that looks at all different. Often, however, 
the styles vary in only a few parameters. In fact, the actual semantics may really be: these
nodes are the same except for these parameters. In the current TiogaArtwork system, it is
possible to define generic style rules that can be referenced in specific style rules to specify
the common set of attributes. Tools are needed to provide an effective interface to this style 
machinery. 


Illustrations are often produced by executing other graphics programs. Frequently these 
programs have little facility for providing graphical style. Many times the illustrations 
produced must be redrawn or the programs fine-tuned to generate publication-quality results. 
Some mechanism for capturing the images produced and for supplying graphical style 
semantics would be most helpful in incorporating such illustrations in documents. 


Another interesting topic is style guidelines which apply to the layout and design of 
illustrations rather than to simple rendering parameters. It can be convenient, for example, to 
express box dimensions as a function of the size of the text in the box. The size of the text 
depends on its font style, so there needs to be some way of specifying this relationship 
between the style and the geometry. In another example, the actual endpoint of an arrow
changes when it is pointing to something rendered with a thick line rather than a thin one.
Style rules might also include document layout parameters. For example, the style of an
illustration could control the layout so that the figure might have a horizontal orientation 
when the document is typeset with wide columns, a vertical one when narrow columns are 
used, or a fixed aspect ratio to suit videotape or slides. A dynamic reconfiguration capability 
would be helpful in making the illustrations look better in the chosen layout.


ABSTRACT


A synthesis algorithm based on composition is described that automatically generates 
a wide range of graphical presentations, such as bar charts, scatter plots, and 
connected graphs. It designs presentations by selecting a design from a set of 
primitive designs or by composing primitive designs in such a way that the input 
information is expressed correctly and effectively. The synthesis algorithm, which is 
implemented using logic programming techniques developed in artificial intelligence 
research, is the primary component of a prototype application-independent
presentation tool called APT (A Presentation Tool). Tools such as APT can be
incorporated into application-specific user interfaces to generate presentations that 
exploit the structure of the input information and the capabilities of the output medium. 




1. Introduction 


The use of computers to develop illustrations for technical publications raises a range of 
graphic arts issues that must be addressed before authors who are not experts in graphic arts
and computer graphics can easily generate effective high-quality illustrations. These issues 
range from detailed rendering problems, such as the design of digital fonts and device- 
independent imaging models, to general graphic arts problems, such as the effective 
utilization of a given medium. The current situation is that it is now possible for authors to 
render high-quality illustrations using computers. Furthermore, illustrating programs are 
making it easy for the author to edit computer-based illustrations. However, current systems
require that authors have graphic arts and computer graphics expertise if they want to
generate effective high-quality illustrations. Effective illustrations have the property that their
rendering, layout, and design details work together to communicate information. An author
developing or modifying an illustration needs assistance to make sure that these graphic arts
issues are addressed. Research is beginning to focus on the development of computer systems
that provide such assistance. Ultimately, computer systems will automatically create effective 
illustrations for the information given to them by authors. 


This paper describes a prototype application-independent presentation tool called APT 
(A Presentation Tool) that automatically transforms symbolic data into an appropriate 
illustration, such as a bar chart, a scatter plot, or a connected graph. In this case, the author is 
an application program that wants to communicate some information to a user. The 
relationship between an application program and a presentation tool is shown in Figure 1. 
The application extracts data, expressed as a set of relations, from its database (perhaps using 
statistical analysis). The presentation tool then synthesizes a design and renders an image that 
presents this data. The novel feature of APT is its ability to compose a wide range of designs, 
thereby accommodating a wide range of input data. It can also exploit the capabilities of a 
variety of output media, such as color and monochrome media. The ultimate goal is to 
develop a tool that can be used by application designers who are not experts in graphic arts or 
computer graphics to ensure that their user interfaces generate correct and effective 
presentations. 





        Application         |         Presentation Tool 
                            | 
             extract        |    synthesis          render 
    database -------> data  |  -----------> design --------> image 





Figure 1: Generation Process Model. 





2. Existing Research on Automatic Graphical Presentation 


The automatic design of graphical presentations of information is a relatively unexplored 
area. The existing research, which has focused on various aspects of the problem, can be 
classified along four major axes: 1) the graphical techniques utilized, 2) the range of designs 
generated, 3) the consideration of graphic arts issues, and 4) the consideration of dialogue 
issues. The graphical techniques axis focuses on the use of 2D position, 3D position, and 
animation. Illustrations in technical publications do not involve animation. The range of 
designs axis identifies the ability of a system to accommodate a wide range of input information 
by being able to generate a wide range of designs. The graphic arts axis identifies research 
that addresses the effectiveness of presentations. The dialogue axis identifies research that 
addresses the linguistic issues involved in a presentation, such as 
automatically choosing the information to be presented or developing a sequence of 
presentations. Given these axes, the existing research on automatic graphical presentation can 
be succinctly described. 


The AIPS system, which is one of the earliest attempts to separate presentation from the 
rest of an application, refines high-level specifications of 2D information displays [21]. The 
key idea is to use the KL-ONE representation system to match information with templates 
that describe information displays. Once a match is found, procedural attachments in the 
template generate the information display. This research differs from more recent research on 
User Interface Management Systems in that it focuses directly on the semantics of graphical 
presentations. The graphical techniques involve 2D position; only a single display appears to 
be generated by their prototype; graphic arts and dialogue issues are not addressed. 


Although Kahn’s exploratory research captured in his ANI and DIAGRAMMER 
systems is difficult to classify, it is important because of the questions it raises [11]. ANI, 
which generates a script of a 2D positional animation from a natural language description of a 
story, asks how information should be communicated with animation. Dialogue issues are an 
important concern of this research. DIAGRAMMER, which uses a knowledge-based 
approach and user advice to generate 2D node-link diagrams, asks what are the conventions 
involved in the layout of node-link diagrams and how should advice be used to modify a 
diagram. 


Gnanamgari's BHARAT system is an early effort at the automatic generation of 2D 
presentation graphics [10]. It selects a pie chart, bar chart, or line chart design for a single 
unary function; the function can have multiple ranges, each of which must be numeric. 
Continuity of the function causes a line chart design to be used. Indication that the range sets 
can be summed to a meaningful total causes a pie chart design to be used. Bar chart designs 
are used for the remaining cases. Although multiple designs are used, the range is limited: 
input that contains multiple relations and non-functional relations is beyond the scope of this 
early research. Graphic arts issues are discussed; the rendering code uses fonts and colors in 
an effective manner. Dialogue issues are not addressed. 


Friedell’s VIEW system automatically generates 2D iconic descriptions of a database's 
contents by the stepwise refinement of object templates [9]. Following the AIPS research, he 
also uses the KL-ONE representation system to describe the templates. The main focus of the 
research is on dialogue issues and an efficient data structure for panning across a large image. 
The dialogue issues are addressed by the stepwise refinement algorithm that can terminate 
when sufficient detail is generated for a given presentation situation. Graphic arts issues are 
not the focus of the research. 


Beach has automated the low-level layout and design of tables whose high-level topology 
is specified by the user as a matrix of rows and columns [1]. Because the graphical style 
properties of the table, such as the typographic rules, background tints, and size constraints, 
are explicit, the user can control parts of the graphic design while the remainder of the 
graphic design is controlled by the existing default style. Explicit graphical style also makes it 
possible for the system to format the table in different ways for different media. However, 
there is no computer assistance for developing effective graphical styles. Dialogue issues are 
not considered. 


More recently, Feiner’s APEX system has addressed the problem of automatically 
generating a sequence of static images that describe actions in a 3D world [7]. The primary 
focus of the research is on building a system that automatically generates an effective 
sequence of illustrations. Irrelevant or redundant details, such as doors that have been seen 
before or door handles that are not part of the action being described, are automatically 
recognized and omitted. Useful details, such as landmarks that describe the location of an 
object or the supporting floor, are also recognized and included. The graphic arts 
issues surrounding the merging of icons, such as arrows, and images of 3D objects are also 
considered. A 3D model obviates concern about the design of the illustration. 


The research described in this paper differs from the previous work in that it focuses on 
the generation of a comprehensive range of 2D static presentations of information. Graphic 
arts issues are captured in expressiveness and effectiveness criteria. Composition is used to 
generate design alternatives so that a wide range of input can be accommodated. Dialogue 
issues are not considered. 


3. Graphical Presentation Problem 


The graphical presentation problem is to express the input data and its structural 
properties effectively, given the capabilities of the output medium. This problem can be 
broken into three major subproblems: 1) accommodating a reasonable range of input, 2) 
making sure the design expresses a given input, and 3) making sure the design generates an 
effective presentation. 


The structural properties of database relations determine the range of input that must be 
accommodated by a presentation tool that generates presentation graphics. A rough estimate 
of the size of this range is given by the following calculation. Given r relations, each of which 
has d domain sets, the number of possible inputs is at least 


    $d^{r} \times (dr)! \times 3^{dr}$. 


The $d^{r}$ factor indicates that each relation can be a functional dependency from zero or more 
domain sets to the remaining domain sets of the relation. The $(dr)!$ factor indicates the 
number of unique ways the domain sets can be shared. The $3^{dr}$ factor indicates that each 
domain set can have one of three types: nominal when the set is a collection of unordered 
items (e.g. {Jay, Eagle, Robin}), ordinal when the set is an ordered tuple (e.g. <Monday, 
Tuesday, Wednesday>), and quantitative when the set is a range (e.g. [24,273]). 


The preceding estimate indicates that a presentation tool must be able to generate a wide 
range of designs to accommodate the potential input from an application. For example, the 
example set of four binary relations used in the major examples in this paper is one of over 4 
billion possibilities that can be given to the presentation tool; each of these possibilities is a 
different design problem. 
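

As a quick check of this figure, the sketch below evaluates the estimate for four binary 
relations (r = 4, d = 2). It assumes the estimate reads d^r x (dr)! x 3^(dr), a reading that 
reproduces the "over 4 billion" count; the function name is illustrative only. 

```python
from math import factorial


def input_space_lower_bound(r: int, d: int) -> int:
    """Evaluate d^r * (dr)! * 3^(dr), the assumed reading of the estimate."""
    n = d * r  # total number of domain sets across the r relations
    return d**r * factorial(n) * 3**n


# Four binary relations: 2^4 * 8! * 3^8 = 16 * 40320 * 6561
print(input_space_lower_bound(r=4, d=2))  # 4232632320
```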


Given the ability to generate designs, the next problem is to make sure that a design 
alternative expresses exactly the input data: that is, all of the input data and only the input 
data. Expressing additional information is potentially dangerous, because it may not be 
correct. For example, all of the data in a set of unordered values can be expressed by a design 
that associates a point of unique size with each value in the set. However, since the sizes of 
points are ordered, the design also states that the input data is ordered, which is incorrect. 


To be useful, a presentation tool must also generate effective designs. Effectiveness can 
be judged in several ways. For example, a design can be judged to be effective if it can be 
interpreted accurately or quickly, if it has visual impact, or if it can be rendered in a cost- 
effective manner. I concentrate on generating designs that can be accurately interpreted. 
Dealing with multiple, perhaps conflicting, evaluations of effectiveness is beyond the scope of 
this research. 


Accuracy of interpretation is determined by the capabilities of the human visual system, 
which are not yet well understood. In the meantime, presentation tools should utilize 
conjectural theories of human perception to estimate how accurately the graphical 
relationships in a given design can be interpreted. An accuracy evaluation should also be 
sensitive to the properties of the output medium, which determines whether a graphical 
relationship can be rendered and the effectiveness of the rendering. 


4. Approach 


A design is the association of graphical relationships, such as position and color, with the 
input relations and their structural properties, such as a set of values that is ordered. Figure 2 
shows graphic designer Jacques Bertin’s vocabulary of the graphical relationships commonly 
used in presentation graphics [2]. Graphic designers use graphical objects, such as points, 
lines, and areas, to encode information via their positional, temporal, and “retinal” 
properties. 


Objects:     points, lines, and areas 
Positional:  1D, 2D, and 3D 
Temporal:    animation 
Retinal:     color, shape, size, saturation, texture, and orientation 


Figure 2: Bertin’s Graphical Objects and Graphical Relationships 


A wide range of designs is required to accommodate the wide range of potential input. 
Given the cross-product of the possible inputs and the graphical vocabulary described in 
Figure 2, a simple list of alternatives will clearly be cumbersome to implement. Furthermore, 
there is no guarantee that a list of alternatives will be comprehensive unless it is generated in a 
principled manner. The approach described in this paper addresses these problems. It is 
based on the observation that complex designs are compositions of simple designs. That is, 
given a set of primitive designs that capture various graphical encoding techniques and some 
composition operators, it is possible to generate a wide range of design alternatives. This 
observation leads in a natural manner to a synthesis algorithm, which is based on a divide and 
conquer strategy: 


— The partitioning phase divides the information into manageable components. 


— The selection phase chooses an appropriate design for each of these components from 
a collection of primitive designs. 


— The composition phase applies composition operators to the individual designs to 
unify them into a single presentation of the information. 


The synthesis algorithm uses expressiveness and effectiveness criteria to search among 
alternative partitionings, primitive designs, and composition operators for the best design. 
The expressiveness criteria reject alternatives that are incorrect. The effectiveness criteria 
determine which alternatives should be considered first. One advantage of the synthesis 
algorithm is that the effectiveness criteria can be based on the capabilities of the output 
medium. If the medium includes color, the synthesis algorithm can use color in its design. If 
the medium does not include color, it must use something else. 


1 The “retinal” properties are so called because the retina of the eye is sensitive to them independent of the 
position of the object. Although they are included in the list of encoding relationships, 3D position and animation 
are beyond the scope of this research. 


The remainder of this paper focuses on the synthesis algorithm. The expressiveness and 
effectiveness criteria are described but their details can be found elsewhere [14]. The next 
section describes the input/output behavior of the synthesis algorithm on a sample problem. 
The following three sections, which refer to this problem, describe the three phases of the 
synthesis algorithm: partitioning of information, selection of primitive designs, and 
application of composition operators. The final sections discuss media sensitivity, the 
prototype implementation, related research, and future research directions. 


5. Sample Input/Output Behavior 


The input to the presentation tool consists of a set of relation tuples to be presented and 
the structural properties of those relations. For example, a typical application is a database 
question-answering system. Figure 3 illustrates the tuples that might be extracted in response 
to a question about cars. These tuples describe the four relations Price, Mileage, Weight, and 
Repair, which have the structural property that they map the cars to their corresponding 
values. The structural properties of these relations, which are normally stored in the database 
schema, are shown in Figure 4 using standard database notation [19]. The structural 
properties are the primary input to the synthesis algorithm. 


Price (accord,5799)        Price (amc-pacer,4749) 
Mileage (accord,25)        Mileage (amc-pacer,17) 
Weight (accord,2240)       Weight (amc-pacer,3350) 
Repair (accord,Great)      Repair (amc-pacer,Terrible) 
Price (audi-5000,9690)     Price (bmw-320i,9735) 


Figure 3: Relation Tuples About 1979 Automobiles 


Price :   Cars → [25,12000] 
Mileage : Cars → [4,40] 
Weight :  Cars → [26,5000] 
Repair :  Cars → <Great,Good,OK,Bad,Terrible> 
Cars = {accord,amc-pacer,audi-5000,bmw-320i, . . .} 


Figure 4: Structural Properties of the Automobile Relations 


The presentation tool produces two outputs: a design and an image rendered from the 
design. A design, which is the primary concern of this paper, consists of a set of encoding 
relations between the graphical objects and the information. For example, Figure 5 describes 
a scatter plot design for expressing the automobile relations. Graphical objects, such as points 
and line segments, encode the domains of the relations; properties of those objects, such as 
position, color, and size, encode the functional information. The image rendered by APT 
from this design is shown in Figure 6. 
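

The input and output described in this section can be written down as plain data. The 
sketch below is only an illustration in Python, not APT's own representation: the Relation 
class, its field names, and the encoded_by helper are assumptions, while the tuples and 
Encodes pairs are copied from Figures 3 and 5. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A binary relation from a domain set to a range, plus its set type."""
    name: str
    domain: str
    range_type: str  # "nominal", "ordinal", or "quantitative"
    tuples: tuple    # (domain value, range value) pairs


# Two of the tuples listed in Figure 3 for each relation.
price = Relation("Price", "Cars", "quantitative",
                 (("accord", 5799), ("amc-pacer", 4749)))
repair = Relation("Repair", "Cars", "ordinal",
                  (("accord", "Great"), ("amc-pacer", "Terrible")))

# A design as a set of Encodes(graphical object or property, information) pairs.
scatter_plot = {
    ("Points", "Cars"),
    ("Position(Points,VertAxis)", "Price(Cars)"),
    ("Position(Points,HorzAxis)", "Mileage(Cars)"),
    ("Color(Points)", "Repair(Cars)"),
}


def encoded_by(design, information):
    """Return the graphical objects or properties that encode `information`."""
    return sorted(g for g, info in design if info == information)


print(encoded_by(scatter_plot, "Repair(Cars)"))  # ['Color(Points)']
```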


6. Partitioning Sets of Relations 


The synthesis algorithm generates a unified design of the input information by 
composing primitive designs, each of which encodes part of the input information. The 
partitioning phase of the synthesis algorithm divides the set of input relations into 
manageable subsets, each of which can be encoded by a primitive design. For example, the 
set of automobile relations 


{Price, Mileage, Repair, Weight} 


does not match any of the primitive designs described in the next section. Therefore, the 
input is partitioned into two subsets: 


{Price} and {Mileage, Repair, Weight}. 


Encodes (VertAxis,[25,12000]) 
Encodes (HorzAxis,[4,40]) 
Encodes (Points,Cars) 
Encodes (Position(Points,VertAxis),Price(Cars)) 
Encodes (Position(Points,HorzAxis),Mileage(Cars)) 
Encodes (Color(Points),Repair(Cars)) 


Figure 5: Scatter Plot Design for Automobile Relations. The Encodes relation maps graphical 
objects, such as a vertical axis, to the input information, such as the range of prices. The input 
relations are written as functions to simplify the description. 


[Figure 6 image not reproduced. Labels recoverable from the plot: Mileage; Car price for 1979; 
Car mileage for 1979; Repair record for 1977; Car weights for 1979; TERRIBLE.] 


Figure 6: Rendered Scatter Plot for the Automobile Data. The disks in this figure are 
intended to have an ordered range of colors from green to red through gray. The letter g 
indicates a green color and the letter r indicates a red color. The combination of letter and 
gray indicates a grayish color. This convention makes it easy to distribute copies of this paper. 


Partitioning is applied recursively until primitive designs exist for each element of the 
partitioning. For example, the set 


{Mileage, Repair, Weight} 


does not match any of the primitive designs and is partitioned again into 


{Mileage} and {Repair, Weight}. 


The partitioning algorithm searches through various partitionings until a composed 
design can be generated. Its search order determines which design is generated first: 
manageable partitions that are generated first tend to be encoded by the most effective 
graphical relationships. For example, Price and Mileage are encoded by position, while the 
other relations are encoded by less effective retinal techniques. This fact makes it possible for 
the application to designate that preferential treatment be given to some of the input 
relations. 
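

The recursive splitting in the automobile example can be sketched as follows. This is a 
minimal illustration that follows a single partitioning order, whereas the algorithm in the 
text searches through alternatives. The matcher single_relation_only is a hypothetical 
stand-in for the test against the set of primitive designs. 

```python
def partition(relations, has_primitive):
    """Split a list of relation names until each part matches a primitive design."""
    if has_primitive(relations):
        return [relations]
    if len(relations) == 1:
        raise ValueError(f"no primitive design for {relations[0]}")
    # Peel off the first relation; earlier relations get first pick of designs.
    return (partition(relations[:1], has_primitive)
            + partition(relations[1:], has_primitive))


def single_relation_only(relations):
    """Hypothetical matcher: a primitive design exists only for one relation."""
    return len(relations) == 1


parts = partition(["Price", "Mileage", "Repair", "Weight"], single_relation_only)
print(parts)  # [['Price'], ['Mileage'], ['Repair'], ['Weight']]
```

Because relations listed first are split off first, an application could in this sketch 
request preferential treatment for a relation simply by placing it earlier in the list. 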


Relations can be partitioned as well as sets of relations. For example, the automobiles 
can also be described with the Auto relation: 


Auto: Car → Dollars, Mpg, Pounds, Judgements. 


Since a primitive design does not exist for this complex relation, it must be partitioned into 
the four binary relations used above, which correspond to the four dependent domains of the 
functional dependency. Armstrong’s axioms from database theory ensure that the information 
is the same in both formulations [19]. I describe other relation partitioning techniques 
elsewhere [14]. 
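

Splitting a relation such as Auto into binary relations amounts to projecting each dependent 
domain against the key. The sketch below assumes the relation is held as a list of tuples 
with the car in the first position; the function name and that layout are illustrative, and 
the values come from Figure 3. 

```python
def split_functional_relation(dependents, rows):
    """Split tuples (key, v1, ..., vn) into one binary relation per dependent domain."""
    return {
        name: [(row[0], row[i]) for row in rows]
        for i, name in enumerate(dependents, start=1)
    }


auto = [
    ("accord", 5799, 25, 2240, "Great"),
    ("amc-pacer", 4749, 17, 3350, "Terrible"),
]
binary = split_functional_relation(["Price", "Mileage", "Weight", "Repair"], auto)
print(binary["Mileage"])  # [('accord', 25), ('amc-pacer', 17)]
```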


7. Selecting Primitive Graphical Designs 


Given a partitioning of the automobile data into individual relations, the next task is to 
find designs for each relation. For example, the Price relation can be encoded by positioning 
points that represent the cars on an axis, by making the area of the points correspond to the 
price of the car, or by placing bar lines between an axis encoding the cars and an axis 
encoding the prices. Each of these alternatives is a primitive design that can be selected to 
encode the relation. 


The selection phase of the synthesis algorithm searches a set of primitive graphical 
designs for designs that state the information in the partition. The range of designs searched 
by the synthesis algorithm is determined primarily by the set of available primitive designs. 
Figure 7 describes the set of primitive designs that I have developed to encompass most of the 
common designs used in presentation graphics [2, 18, 12, 16, 5, 3]. This set is based on 
Bertin’s vocabulary of graphical encoding techniques; it has been expanded to include 
additional techniques and encoding conventions used in presentation graphics. 





Encoding Technique              Design 

Single Position                 Horizontal axis, Vertical axis 
Apposed Position                Line chart, Bar chart, Plot chart 
Retinal List                    Color, Shape, Size, Saturation, 
                                Texture, Orientation 
Map                             Road map, Topographic map 
Connection                      Tree, Acyclic graph, Network 
Misc. (Angle, Contain, . . .)   Pie chart, Venn diagram, . . . 


Figure 7: A Useful Set of Primitive Graphical Designs 





7.1 Expressiveness Criteria 


The selection phase searches for a primitive design that expresses exactly the information 
in the partition. For example, the Price relation can be expressed by a single position design 
(see the horizontal axis design in Figure 8). Expressiveness criteria, which refer to the 
structural properties of the input relations, determine when a primitive design can be selected. 
For example, if the Price relation were one-to-many, a single position design could not 
express the input information because every point, representing a car, has a single position on 
the price axis and thus encodes a single price. 


[Figure 8 image not reproduced: horizontal axis labeled "Car price for 1979"; visible tick 
labels 3500 and 6000.] 


Figure 8: Horizontal Single Position Design of Car Price 





2 Single position designs can be used for many-to-one relations by using a “jittering” algorithm [4] during 
rendering to spread the points above the axis so that they remain identifiable. 


Encoding Technique              Expressiveness Criteria 

Single Position                 X → Y           (X is nominal) 
Apposed Position                X × Y           (X, Y not nominal) 
Retinal List                    X, or X → Y     (X not quantitative) 
Map                             L → X1, . . .   (L is a location) 
Connection                      X × X           (X is nominal) 
Misc. (Angle, Contain, . . .)   Generally, X × Y 


Figure 9: Expressiveness Criteria for the Primitive Designs 


Each primitive design has an associated expressiveness theorem that describes the 
information it can express (see Figure 9). Elsewhere I show how to formulate and prove such 
theorems [13, 14]. These theorems place two types of restrictions on the information to be 
expressed: structural and set type. Structural restrictions refer to the standard relational 
properties such as arity (number of domains) and functional dependencies. Set type 
restrictions refer to the fact that there are three major types of measurements: nominal, 
ordinal, and quantitative [17]. Nominal measurements (e.g. {USA, Japan, . . .}) determine 
equality or inequality. Ordinal measurements (e.g. { Great, Good, . . .}) determine relative 
ordering. Quantitative measurements (e.g. {0,32,273}) determine the actual numeric values.³ 
Figure 10 shows the expressiveness of various retinal techniques for nominal, ordinal, and 
quantitative information. 
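

A set type restriction can be checked with a simple table lookup. The sketch below covers 
only size and saturation. The nominal exclusion follows the point made earlier that sizes 
are perceived as ordered; treating both techniques as expressive for ordinal and 
quantitative sets is an assumption of this sketch, as are the names used. 

```python
# Set types for which each retinal technique is taken to be expressive.
# Only size and saturation are listed; both entries are assumptions.
RETINAL_EXPRESSIVENESS = {
    "size": {"ordinal", "quantitative"},
    "saturation": {"ordinal", "quantitative"},
}


def expresses(technique, set_type):
    """True if the technique can encode the set type without implying extra structure."""
    return set_type in RETINAL_EXPRESSIVENESS.get(technique, set())


print(expresses("size", "nominal"))  # False
print(expresses("size", "ordinal"))  # True
```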


Sometimes expressiveness evaluation can be too strict for an application’s presentation 
needs. For example, the database question-answering system that generated the automobile 
relations might have been answering a comparative question about 1979 automobiles rather 
than a question about the details of the individual cars. Nevertheless, strict expressiveness 
requires that the input information be expressed exactly, including the names of the cars. 
Strict expressiveness can be accomplished by labeling the points in the scatter plot in Figure 
6, but these labels will obscure the encoding relationships that must be perceived to answer 
the comparative question. In this case, the application requests a “coarse level of detail” that 
specifies that the car names can be excluded from the expressiveness evaluation. 


3 Some quantitative measurements permit the comparison of the ratios of values. For example, Kelvin 
temperatures do, but Centigrade temperatures do not. 


              Nominal   Ordinal   Quantitative 
Size             -         ●           ● 
Saturation       -         ●           ● 


Figure 10: Expressiveness of Retinal Techniques. The - indicates that size and saturation 
should not be used for nominal measurements because they will probably be perceived to be 
ordered. The * indicates that the full color spectrum is not ordered. However, parts of the 
color spectrum are ordinally perceived [20]. 


7.2 Effectiveness Criteria 


When several primitive designs express the information in the partition, the selection 
phase uses effectiveness criteria to select the best design first. For example, the Price relation 
can also be expressed with a retinal list design: points encode the cars and the area of each 
point encodes the price (see Figure 11).⁴ An effectiveness evaluation is required to determine 
which of these designs can be interpreted more accurately. As described in Section 3, 
accuracy of interpretation is determined by human perceptual capabilities, which are poorly 
understood. I have developed a conjectural theory of effectiveness that is based on current 
experimental evidence and graphic designer knowledge. This theory can be replaced by a 
better one when one is developed. 


Figure 11: Retinal List Design of Car Price 


4 Retinal lists can include labels even when the application has indicated a coarse level of detail, because labels do 
not obscure the other encoding relationships. 


The core of this theory is an accuracy ranking developed by Cleveland and McGill of the 
perceptual tasks associated with the graphical encoding of quantitative information [6]. When 
people interpret graphical presentations, they must accomplish perceptual tasks such as 
estimating position or area. Cleveland and McGill have conducted experiments that show 
that some tasks associated with quantitative information can be accomplished more accurately 
than others. They have developed a ranking of these quantitative tasks that is consistent with 
their experimental results. I have developed similar rankings for the nominal and ordinal 
tasks. Because my rankings are based on graphic designer knowledge and formal analysis 
rather than empirical evidence, they should be treated as conjectural [14]. Using these 
rankings, which are summarized in Figure 12, it is clear that the single position design is more 
effective than the retinal list design for the Price relation. 
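

Choosing among expressive designs with an accuracy ranking can be sketched as below. The 
list is an abridged form of Cleveland and McGill's published ordering for quantitative tasks 
and is not claimed to match Figure 12 entry for entry. The mapping from designs to 
perceptual tasks and the function name are assumptions made for illustration. 

```python
# Abridged ordering, most accurate first, following Cleveland and McGill's
# published ranking of quantitative perceptual tasks.
QUANTITATIVE_RANKING = ["position", "length", "angle", "area", "volume"]

# Hypothetical mapping from candidate designs to the task a reader performs.
CANDIDATES = {
    "single position": "position",
    "retinal list (size)": "area",
}


def most_effective(candidates, ranking):
    """Pick the design whose perceptual task appears earliest in the ranking."""
    return min(candidates, key=lambda design: ranking.index(candidates[design]))


print(most_effective(CANDIDATES, QUANTITATIVE_RANKING))  # single position
```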


8. Composing Graphical Designs 


The final task is to compose the partition designs into a unified design. The key idea is to 
merge the parts of the partition designs that encode the same information. For example, 
when the Price partition is encoded with a vertical axis design and the Mileage partition is 
encoded with a horizontal axis design, both designs contain the encoding 


Encodes (Points,Cars). 


Therefore, the points encode the same information and the partition designs can be merged 
into a scatter plot design. 


Figure 13 lists three composition operators based on the merging of designs. Mark 
composition is applied when the partition designs contain marks (points, lines, or areas) that 
encode the same information. For example, the scatter plot in Figure 6 is the mark 
composition of four primitive designs: two single position designs, a color list design, and a 
size list design. Axes composition is applied when the corresponding axes in the partition 
designs encode the same information. For example, axes composition generates multiple line 
charts. Axis alignment is applied when a pair of corresponding axes encode the same 
information. For example, axis alignment generates the aligned bar chart in Figure 14. Axis 
alignment is less effective than the other two operators because it does not actually merge the 
designs; it should only be utilized when the others fail. 


[Figure 12 image not reproduced. Column headings: Quantitative, Ordinal, Nominal; recoverable 
entries: Position, Shape.] 


Figure 12: Ranking of Perceptual Tasks. The tasks shown in shaded rectangles are not 
relevant to that type of data. 








Operator              Applicability Condition 

Mark composition      Marks encode the same information 
Axes composition      Corresponding axes encode the same information 
Axis alignment        Pair of axes encode the same information 


Figure 13: Three Composition Operators 


Applicable composition operators may not succeed. For example, the selection phase 
initially assigns the Price and Mileage partitions to the same horizontal axis design, because 
they have the same structural and set type properties. Mark composition is applicable 
because the points in both partition designs encode the same information; it fails because the 
horizontal positions of the points are inconsistent and therefore one of the relations cannot be 
expressed in the composition. When composition fails, backtracking occurs so that other 
designs and partitionings can be considered. In this case, a vertical axis design is selected for 
the Price partition and the scatter plot is generated. The details about the evaluation of the 
consistency of compositions can be found elsewhere [14]. 
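

The failure and retry described above can be sketched with designs held as dictionaries that 
map a graphical object or property to the information it encodes. This is an illustration 
under stated assumptions, not APT's consistency evaluation: the dictionary layout, the 
function names, and the search order (chosen so that the Price choice is revisited first) 
are all hypothetical. The ranges are those given in Figure 5. 

```python
from itertools import product


def axis_design(axis, relation, value_range):
    """Single position design as a dict: graphical object/property -> information."""
    return {
        "Points": "Cars",
        axis: value_range,
        f"Position(Points,{axis})": f"{relation}(Cars)",
    }


def mark_compose(a, b):
    """Merge two designs whose marks encode the same information, or return None."""
    if a["Points"] != b["Points"]:
        return None  # not applicable: the marks encode different information
    merged = dict(a)
    for key, info in b.items():
        if merged.setdefault(key, info) != info:
            return None  # inconsistent: one object would encode two things
    return merged


def first_consistent(options_a, options_b):
    """Try design pairs in order and return the first composition that succeeds."""
    for a, b in product(options_a, options_b):
        merged = mark_compose(a, b)
        if merged is not None:
            return merged
    return None


# Candidate designs per partition, horizontal axis tried first.
mileage = [axis_design(ax, "Mileage", "[4,40]") for ax in ("HorzAxis", "VertAxis")]
price = [axis_design(ax, "Price", "[25,12000]") for ax in ("HorzAxis", "VertAxis")]

design = first_consistent(mileage, price)
print(design["Position(Points,VertAxis)"])  # Price(Cars)
```

The first pair fails because both designs claim the horizontal axis. The next pair keeps 
Mileage on the horizontal axis and moves Price to the vertical axis, which yields the 
scatter plot encoding. 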


9. Media Sensitivity 


Because the synthesis algorithm searches for designs that express the input information 
effectively, it can be sensitive to the capabilities of the output medium. For example, a color 
medium enables APT to generate the scatter plot shown in Figure 6. A monochrome medium 
restricts APT to the aligned bar chart shown in Figure 14, which is the axis alignment of four 
bar chart designs.⁵ In a color medium, the Repair relation can be encoded by a color list 
design. However, in a monochrome medium, the only available retinal list designs for ordinal 
information are texture, saturation, and size (see Figure 10). The texture design was rejected 
by an expressiveness criterion because the rendering portion of APT does not implement 
texture. The saturation design was rejected by an effectiveness criterion because five levels of 
gray blend together, making the repair values indistinguishable. The size design can be 
selected. However, size must be used for one of the other relations in the scatter plot design 
and mark composition cannot merge two designs that use size to encode different sets. APT 
ultimately settles on the aligned bar chart design. 
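

The chain of rejections for the Repair relation can be traced with a small filter. This is 
a sketch only: the candidate order, the max_gray_levels threshold, and the function and 
parameter names are assumptions introduced here, while the three reasons for rejection are 
the ones given in the paragraph above. 

```python
def repair_encoding(has_color, implemented, already_used, levels, max_gray_levels=4):
    """Pick a retinal technique for an ordinal relation, or None if all are rejected."""
    candidates = ["texture", "saturation", "size"]
    if has_color:
        candidates.insert(0, "color")
    for technique in candidates:
        if technique not in implemented:
            continue  # expressiveness: the renderer cannot draw it
        if technique == "saturation" and levels > max_gray_levels:
            continue  # effectiveness: too many gray levels to tell apart
        if technique in already_used:
            continue  # composition: the technique already encodes another relation
        return technique
    return None


renderer = {"color", "saturation", "size"}  # texture is not implemented
print(repair_encoding(True, renderer, {"size"}, levels=5))   # color
print(repair_encoding(False, renderer, {"size"}, levels=5))  # None
```

A result of None corresponds to the case where no retinal list design survives, so the 
search has to move on to a different design such as the aligned bar chart. 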


5 APT always generates the aligned bar design before the scatter plot design when the application requests a fine 
level of detail, because the bar charts contain the names of the cars, whereas labels on points in the scatter plot obscure 
information. When the output device is a computer monitor, however, it is generally better to request a coarse 
level of detail, because omitted details can be obtained by interacting with the display using techniques such as 
pick-sensitive objects and pop-up windows. 


Figure 14: Aligned Bar Chart of Automobile Data 


10. Prototype Implementation: APT 


APT (A Presentation Tool) is a prototype presentation tool based on the results described 
in this paper. It consists of a design component followed by a rendering component. The 
design component uses logic programming techniques to implement the synthesis algorithm. 
The rendering component uses object-oriented programming techniques and a device- 
independent graphics package to render the resulting designs. The rendering component, 
which was not the focus of this research, determines which primitive designs in Figure 7 are 
used by APT. As of this writing, the following primitive designs remain unimplemented: 
orientation, texture, line charts, maps, and miscellaneous. Even with these restrictions, the 
prototype can generate a wide range of interesting presentations. Figure 16 contains three 
examples. 


The logic program implementing the synthesis algorithm is based on a depth-first 
backward chaining algorithm called Residue [8]. Residue is useful for design problems 
because predicates describing the design can be declared to be assumable. For example, 
Figure 15 describes APT's bar chart rule. The assumables are the Encodes relations of the bar 
chart primitive design. When these predicates are assumed, they can be used to compose this 
design with others and to render the final image. 
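
The role of assumable predicates can be illustrated with a toy depth-first backward chainer. This is a sketch under stated assumptions and is not the Residue algorithm itself: goals are plain strings without variables, there is no cycle detection, and the function name `prove` and the sample predicates are hypothetical.

```python
# Toy depth-first backward chainer with assumable predicates (illustration only).

def prove(goal, facts, rules, assumable, assumed=None):
    """Try to prove `goal`; return the list of assumptions made, or None on failure."""
    assumed = [] if assumed is None else assumed
    if goal in facts or goal in assumed:
        return assumed
    if goal in assumable:
        return assumed + [goal]           # accept the goal and record the assumption
    for head, body in rules:              # rules are tried in order, depth first
        if head != goal:
            continue
        current = assumed
        for subgoal in body:
            current = prove(subgoal, facts, rules, assumable, current)
            if current is None:
                break                     # this rule failed; try the next one
        else:
            return current
    return None


facts = {"non_numeric(car)", "numeric(price)", "few_values(car)"}
rules = [("presents(barchart, price)",
          ["non_numeric(car)", "numeric(price)", "few_values(car)",
           "encodes(lines, car)", "encodes(vaxis, price)"])]
assumable = {"encodes(lines, car)", "encodes(vaxis, price)"}
design = prove("presents(barchart, price)", facts, rules, assumable)
```

The returned list of assumed Encodes facts plays the part of the design description: it is what a composition step or a renderer would consume.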


APT was developed on a Symbolics Lisp Machine using MRS, a representation system
[15]. Normally, designs are generated in 1-2 minutes and images are rendered in less than a
minute. However, APT is a functional prototype and no effort has been made to make it
efficient. The logic program is about 200 rules (in 14 pages), and the rendering system is about
60 pages of Lisp code.


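\[
\begin{array}{ll}
\mathit{rel} = x \rightarrow y \,\land\, \lnot\mathit{Numeric}(x) \,\land\, \mathit{Numeric}(y) \,\land & \text{Expressiveness} \\
\mathit{Cardinality}(x) < 20 \,\land & \text{Effectiveness} \\
\mathit{LineObjs}(\mathit{barchart}, \mathit{lines}) \,\land\, \mathit{VertAxis}(\mathit{barchart}, \mathit{vaxis}) \,\land & \text{Assumables} \\
\mathit{Encodes}(\mathit{lines}, x) \,\land\, \mathit{Encodes}(\mathit{vaxis}, y) \,\land & \text{Assumables} \\
\mathit{Length}(\mathit{lines}, \mathit{len}, \mathit{vaxis}) \,\land\, \mathit{Encodes}(\mathit{len}, \mathit{rel}(x)) \,\land\, \ldots & \text{Assumables} \\
\Rightarrow \mathit{Presents}(\mathit{barchart}, \mathit{rel}) &
\end{array}
\]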


Figure 15: APT’s Bar Chart Rule. The expressiveness conditions state that the relation must 
be a functional dependency from a non-numeric set to a numeric set. The effectiveness 
condition limits the number of bars. The assumables connect the relation and the 
presentation. The independent set is connected to the bar lines and the dependent set is 
connected to the vertical axis. 
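
The conditions of the bar chart rule can also be read as an ordinary predicate. The sketch below is a hedged restatement, not APT's rule: the `Relation` class, the helper `is_numeric`, and the dictionary returned for the assumables are hypothetical, and the sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Relation:
    name: str
    domain: list      # values of the independent set x
    range_: list      # values of the dependent set y, one per domain value


def is_numeric(values):
    return all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values)


def bar_chart_rule(rel, max_bars=20):
    """Return the assumed Encodes facts if a bar chart can present `rel`, else None."""
    # Expressiveness: a functional dependency from a non-numeric set to a numeric set.
    functional = len(set(rel.domain)) == len(rel.domain) == len(rel.range_)
    if not (functional and not is_numeric(rel.domain) and is_numeric(rel.range_)):
        return None
    # Effectiveness: limit the number of bars.
    if not len(rel.domain) < max_bars:
        return None
    # Assumables: connect the relation to the parts of the presentation.
    return {"lines": "x", "vaxis": "y", "length(lines, vaxis)": f"{rel.name}(x)"}


prices = Relation("Price", ["car A", "car B", "car C"], [5000, 4500, 4700])
encoding = bar_chart_rule(prices)
```

A `None` result means the rule does not apply, so a search would move on to another primitive design.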
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Figure 16: Three Designs Generated by APT. The layered bar chart describes the number of 
Ph.D. students that graduated in each quarter for a range of years. A single input relation was 
partitioned to generate this design. The multiple scatter plot describes a month of ozone 
measurements for two New York cities. The axes composition operator generated this design. 
The “layered” graph describes the prerequisites and proposed schedule of some computer 
science classes. The mark composition operator generated this design. 
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11. Conclusion 


APT demonstrates that a synthesis algorithm based on composition can be used to 
automatically generate correct and effective designs for a wide range of input relations. One 
advantage of the synthesis algorithm is that its search can be sensitive to the output medium 
as well as the structural properties of the input relations. Another advantage of the synthesis 
algorithm is that it is flexible. The range of designs that are searched can be modified by 
changing the set of primitive designs or the set of composition operators. The search order 
can also be modified by changing the expressiveness and effectiveness criteria. This is 
important because presentation graphics and human perceptual abilities are not yet well 
understood. As our understanding advances, modifications can be made to the synthesis 
algorithm so that it will generate even more effective designs. 


Many problems associated with the automatic generation of graphical designs remain to 
be solved. The engineering of robust presentation tools will raise many questions about the 
correct search criteria. Animated or 3D presentations appear to be very powerful techniques 
for presenting symbolic information and should be incorporated into future tools. Larger 
search spaces, which can be generated with finer grained sets of primitive designs, make it 
more difficult to search for an appropriate answer in real time. However, it may be possible 
to build a discovery system that searches this larger space for unusual but effective designs. 
These designs can then be cached with the primitive designs described in this paper to form 
an efficient, comprehensive search space. 
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A number of research projects have explored the use 

of computer graphics to explain how things work. Their
approaches have been as diverse as dynamic, videodisc-based
movie manuals, directed-graph-structured pictorial
documents, and animated algorithm simulations. In these
systems, creating a presentation may require the collaboration
of many people, including subject matter experts,
authors, designers, photographers, illustrators, and editors.
Thus the presentation design process is expensive and
time-consuming. It is our hope that someday there will be
computer-based experts that can communicate with a user
through pictures, words, and sounds. Systems of this sort
would not only have to possess knowledge of the subject
areas in which they were expert, but would also have to know how
to design and produce effective visual presentations.

Although much attention has been paid to the problems
of rendering prearranged collections of 2D or 3D objects,
automating the design of pictures is an area of computer
graphics that is relatively, although not totally, unexplored.
Work in knowledge-based graphics has included animation
scripting and diagram layout, design of business graphics,
generation of diagrammatic information displays, animated
instructions for a CAD system, and automatic
synthesis and layout of icons in a graphical database
interface.

APEX (Automated Pictorial EXplanations) is an ex- 
perimental system that we have built to examine some of 
the problems involved in depicting actions in a 3D world. 
It contains two components: one part, described here, 
which creates pictures of parts of the world, and a second 
which designs and lays out displays containing these 
pictures. 

APEX differs from previous work in that it attempts to 
depict existing worlds of 3D objects and actions by 
analyzing their relationships and then determining what 
part if any each should play in a picture, how each should 
be rendered, and how each should affect the viewing 
specifications for the picture. APEX is essentially an 
intermediary between an expert system that figures out 
what actions must be performed to solve a problem and 
rendering software that scan-converts the explanatory 
pictures that APEX generates. In our work we treat both 
the expert system and the rendering tools as “black boxes,” 
although they are, of course, subjects of extensive research 
in their own right. 


Picture generation 


What kind of pictures should APEX generate? Much
work in computer graphics has concentrated on the synthesis
of realistic pictures. There are many situations,
however, in which a realistic picture can provide too much
detail to communicate as effectively as a simpler, stylized
picture. Books on writing technical manuals, for example,
recommend deleting unnecessary details and highlighting
objects of interest in drawings.

A number of attempts have been made to formalize rules
for presenting quantitative information in the form of
charts and graphs. APEX incorporates the crude beginnings
of a model for the creation of pictures that show
actions, such as pushing, pulling, or turning, being performed
on physical objects.

Consider, for example, how to decide what objects to 
include in a picture. Suppose the picture is meant to tell a 
person to turn a knob on a piece of equipment. One 
solution might be to include in the picture the knob, the 
equipment on which it is located, and, perhaps, the hand 
turning it. If the equipment is the only thing in the room 
and the knob the only feature on it, this might be 
appropriate. But what if the equipment has many knobs
and switches and the room contains many similar pieces of
equipment? There must be a compromise between the
overwhelming detail of showing with photographic accuracy
everything that a person might see and the potential
ambiguity of showing only those objects that participate in
the action.

We would like the contents of the picture that we
generate to depend on whether the person knows the
location of the knob on the equipment or even the location
of the equipment itself. If the person is unfamiliar with the
equipment’s location, showing it relative to the rest of the
room may help. The experienced user, on the other hand,
may not need the extra context. Also, the appearance of
the objects themselves should influence the picture’s design.
If the knob is one of a row of identical knobs, showing the
knob in context may be more worthwhile than it would be
if the knob were unique in shape and color. Rendering the
knob with greater detail may be useful if the added detail
reveals differences between it and similar knobs with which
it may be confused.

In APEX, we have attempted to eliminate unneeded
detail in pictures while emphasizing important features.
Our approach has been to design a system that uses rules to
govern each aspect of a picture’s composition by determining
which objects will be depicted, which rendering
style will be used for each, and which viewing specifications
will be employed.


System overview 


APEX is initially provided with information about a
world of objects, the actions to be performed on it, and
what the user already knows.


Actions. The actions to be depicted are those “performed”
by a problem solver, an AI system that plans how
to accomplish the task that we want to tell someone about
by “doing” the task itself. APEX talks to problem solvers
that use rules about a task to be performed, and knowledge
about the current state of affairs, in order to break down a
high-level task into a hierarchy of lower-level actions.
Each kind of action is represented as a frame, a collection
of facts about the action and its participants. When the
problem solver executes an action it instantiates one of
these archetypal frames. Associated with an action frame
instance is information about the important objects that
participate in it and the nature of their roles, any other
actions that have to be performed as subactions to accomplish
it, when it should be executed relative to the other
actions, and those changes that it effects in the environment.

Because the problem solvers run quite slowly, APEX
does not communicate with them interactively. Instead, the
desired problem solver is run in advance and information
about the structure of the solution it generates is saved
automatically in a detailed execution trace. APEX reads in
the trace and uses it to recapitulate the actions of the
problem solver incrementally, without incurring the
decision-making overhead.


Objects. APEX uses the same object database as does
the problem solver. Objects are hierarchically structured as
trees of 3D parts. Leaf nodes are physical objects with
properties such as material, color, size, shape, and position,
while internal nodes are assemblies of leaf and internal
nodes containing only transformations. Objects have
information about their function and are also characterized
by the relationships (such as “on” and “in”) that they bear
to one another.

As mentioned above, we would like to be able to display
objects at varying levels of detail. Physical properties are
initially associated only with the leaves of APEX’s object
trees. Therefore, we have developed a detail-removal
process that associates with each nonleaf object a simplified
version of the object properties possessed by its children.
This makes it possible to draw a high-level approximation
of an object without processing its children, selectively
progressing down the tree only when more local detail is
desired. (See Figure 1.)


Figure 1. Detail removal. (a) The hierarchical object database originally contains information about the physical
properties of leaf nodes—in this example the individual buttons and panel of a console. Internal nodes, such as the
console or set of buttons, represent assemblies that may be drawn only by drawing all the objects at their descendent
leaves. (b) The automated detail removal procedure described in the text combines adjacent subobjects and eliminates
relatively small ones to produce a simplified version of the object. Detail removal is performed for each internal node to
produce physical property descriptions in the same format as those of the leaf nodes. A simplified version of the object
is displayed by traversing the hierarchy to the desired depth and drawing only the deepest objects encountered. The
level of detail displayed may be varied across the object by selectively exploring some parts of the hierarchy more
deeply to show them in more detail.


The detail-removal method that we have used begins by
processing nodes whose children are all leaves. The
projection of a node’s children onto a selected viewplane is
computed. The system then proposes simplifications of the
children by merging adjacent objects and eliminating
relatively small objects. Each simplification is also projected
onto the viewplane and the differences between it and the
unsimplified children’s projection are computed. The
difference metric is based on the size and color of corresponding
areas in the two projections. The simplification that
results in the smallest number of objects, and whose
difference is less than a pragmatically chosen maximum, is
used. The process is then propagated up the tree to the
root. Because the detail-removal process runs quite slowly,
it is currently performed as a preprocessing step.

November 1985 
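
The selection step of the detail-removal method (propose simplifications, measure their difference from the original projection, keep the smallest acceptable one) can be sketched as follows. This is a simplified illustration and not APEX's procedure: projected children are stood in for by axis-aligned rectangles `(x0, y0, x1, y1)`, color is ignored, merging is reduced to taking a bounding box, and the function names and thresholds are hypothetical.

```python
# Hypothetical sketch of choosing a simplification for one node's projected children.

def area(r):
    x0, y0, x1, y1 = r
    return max(0, x1 - x0) * max(0, y1 - y0)


def bounding_box(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def propose(rects, small_fraction=0.05):
    """Candidates: the original, a copy without relatively small parts, one merged box."""
    total = sum(area(r) for r in rects)
    kept = [r for r in rects if area(r) >= small_fraction * total]
    return [rects, kept, [bounding_box(rects)]]


def difference(original, simplified):
    """Crude metric: relative change in covered area (overlaps are ignored)."""
    before = sum(area(r) for r in original)
    after = sum(area(r) for r in simplified)
    return abs(before - after) / before


def simplify(rects, max_difference=0.25):
    """Fewest objects among candidates whose difference is below the maximum."""
    acceptable = [c for c in propose(rects) if c and difference(rects, c) < max_difference]
    return min(acceptable, key=len)


console = [(0, 0, 10, 6), (0, 6, 10, 10), (4, 4, 4.5, 4.5)]   # two panels and a button
simplified = simplify(console)
```

Applying `simplify` bottom-up, starting with nodes whose children are all leaves, would give each internal node an approximate description in the same form as a leaf.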


Associating approximate physical properties with in- 
ternal nodes also allows objects to be compared hierarchi- 
cally. The ability to compare two hierarchical objects 
quickly is exploited by APEX to find objects that are 
similar to some target object and to determine how they 
differ from it. First, the root nodes of both objects are 
compared with regard to size, shape, and material. If the 
differences are big enough to be considered significant (an 
ad hoc determination), the comparison stops; otherwise, 
the objects’ next levels are recursively compared, with 
subobjects matched according to their positions in their 
parents (so that things occupying roughly the same relative 
position in their parents are compared). 
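
The recursive comparison can be sketched in a few lines. This is an illustration only: the `Node` class, the tolerance used for the ad hoc significance test, the matching of subobjects by list order, and the treatment of unequal child counts as a difference at the parent are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    size: float
    shape: str
    material: str
    children: list = field(default_factory=list)   # ordered by position in the parent


def differs(a, b, size_tolerance=0.2):
    """Ad hoc significance test on size, shape, and material."""
    bigger = max(a.size, b.size)
    return (a.shape != b.shape or a.material != b.material
            or abs(a.size - b.size) > size_tolerance * bigger)


def compare(a, b, path=""):
    """Return the places at which significant differences are first found."""
    here = f"{path}/{a.name}|{b.name}"
    if differs(a, b):
        return [here]                       # stop descending on this branch
    if len(a.children) != len(b.children):
        return [here + " (child count)"]
    found = []
    for child_a, child_b in zip(a.children, b.children):
        found += compare(child_a, child_b, here)
    return found


def cabinet(name, drawers):
    parts = [Node(f"drawer{i}", 3, "box", "metal") for i in range(drawers)]
    return Node(name, 10, "box", "metal", parts + [Node("panel", 2, "box", "metal")])


result = compare(cabinet("transmitter", 1), cabinet("receiver", 3))
```

An empty result means that no significant difference was found at any level, so the two objects would be treated as easily confused.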


User knowledge. Associated with each object and action
is some information about what the user knows (for
example, an object’s approximate location). This information
forms an extremely rudimentary user model. APEX
updates the user model whenever it depicts an object,
reflecting a presumed increase in the user’s familiarity with
the object’s location. This information guides the picture-making
process by determining the amount of context to
include.


Figure 2. A fully detailed view of a set of objects for which APEX can create
pictures. Shown here are parts of a sonar system. The three large cabinets are
(from left to right) the receiver, the transmitter, and the interface. A small
speaker hangs on the wall to the right of the transmitter.


The picture representation 


APEX creates a picture by performing operations on a 
frame data structure in which the picture is represented. 





This picture frame has slots for the important attributes of 
a picture: the objects that it contains, its lighting, and its 
viewing specifications. Frame slots may be made to affect 
one another through the use of rules that fire when a slot is 
modified, causing changes in others. APEX also marks the 
objects that are inspected with its comparison routine to 
indicate where their differences and similarities lie. 
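
A frame whose slots trigger rules on modification can be sketched as follows. The class `PictureFrame`, its slot names, and the sample rule are hypothetical and serve only to illustrate the mechanism; they are not taken from the APEX implementation.

```python
class PictureFrame:
    """A frame whose slots can fire rules when they are modified."""

    def __init__(self):
        self.slots = {"objects": [], "lighting": None, "view_extent": None}
        self.rules = {}                        # slot name -> list of rule functions

    def on_change(self, slot, rule):
        self.rules.setdefault(slot, []).append(rule)

    def set(self, slot, value):
        self.slots[slot] = value
        for rule in self.rules.get(slot, []):
            rule(self, value)                  # a rule may in turn set other slots


def grow_view_to_fit(frame, objects):
    """Example rule: keep the viewing extent wide enough for every object."""
    if not objects:
        return
    xs = [x for o in objects for x in (o["x0"], o["x1"])]
    frame.set("view_extent", (min(xs), max(xs)))


frame = PictureFrame()
frame.on_change("objects", grow_view_to_fit)
frame.set("objects", [{"name": "drawer", "x0": 2, "x1": 5},
                      {"name": "handle", "x0": 3, "x1": 4}])
```

After the last call the `view_extent` slot has been updated by the rule, without the caller touching it directly.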

In order to render a picture, APEX first turns its internal 
picture data structure into a precise 3D scene specification 
format accepted by several locally written rendering sys- 
tems. APEX generates a picture specification by traversing 
those parts of the object hierarchy referenced by the picture 
frame. The determination of what rendering style to use 
and how far to travel down each part of the hierarchy is 
based on information associated with the objects while a 
picture is created, as described below. 


Depicting a single action 


The picture creation process will be illustrated by an 
example taken from one world for which APEX can 
generate pictures. Figure 2 shows this world’s objects, the 
components of a stylized sonar system, in full detail. There 
are three large cabinets—a receiver, a transmitter, and an 
interface. A small speaker cabinet hangs on the wall.
Figures 3-10 show the steps in the creation of a typical picture. 
They were generated by stopping APEX at each step of the 
picture-creation process and sending the partially com- 
pleted picture specification to the renderer. The completed 
picture is intended to tell the viewer to pull out the drawer 
of the large center cabinet, using its middle handle. APEX 
was initially told that the picture’s viewer had no idea 
where any of the objects depicted are located. 

First, APEX initializes the viewing and lighting specifications
based on the location of the viewer who is to
perform the action. Associated with each depictable action 
frame is a priori information about the objects that play 
important roles in the action and therefore must be 
depicted. We refer to these as the picture’s frame objects. 
These are the objects about which the picture will crystal- 
lize. APEX finds the frame objects for the particular 
instance of the action being performed and adds them to 
the picture. 
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Figure 3. The first step in the 
metamorphosis of a picture depicting the 
command to open the drawer of the 
transmitter (the center cabinet shown in 
Figure 2): Frame objects, specified by 
information associated with the action 
frame being depicted. The viewer is 
initially assumed to know nothing about 
the location of any of the objects. The 
succeeding steps are shown in Figures 4-10.


In Figure 3 they are the transmitter drawer (the object to 
be opened) and its middle handle (the object by which the 
opening is performed). APEX has rules that fire when an 
object is added to a picture. These rules discriminate on the 
kind of object being added. Frame objects cause the 
viewing specifications to be modified to include the objects 
in the viewplane in their entirety. The dark-blue back- 
ground of each picture represents its extent. 

The user may have some knowledge about each of the 
frame objects, including some notion of where it might be 
found. This information is currently represented as a part 
of the object hierarchy in which the object is known to 
reside. APEX follows the hierarchy up from each frame 
object until it finds an object with which the user is 
familiar. These ancestor objects are added to provide 
context. Figure 4 shows the transmitter of which the 
drawer is a part. 
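
The walk up the hierarchy can be sketched as below. The function name, the `parent_of` mapping, and the decision to include the first familiar ancestor itself in the context are assumptions for illustration.

```python
def context_ancestors(obj, parent_of, familiar):
    """Collect the ancestors of `obj`, stopping at the first familiar one."""
    context = []
    node = parent_of.get(obj)
    while node is not None:
        context.append(node)
        if node in familiar:
            break                      # the user can already locate this object
        node = parent_of.get(node)
    return context


parent_of = {"handle": "drawer", "drawer": "transmitter", "transmitter": "room"}
known = context_ancestors("handle", parent_of, familiar={"transmitter"})
unknown = context_ancestors("handle", parent_of, familiar=set())
```

With a familiar transmitter the walk stops there; with no familiar objects it continues to the root of the hierarchy.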

APEX next attempts to find landmarks, objects whose 
properties should make them good reference points for 
locating other objects. These properties may be physical 
properties such as color, shape, or size, but may also be the 
user's familiarity with an otherwise nondescript object. 
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Figure 4. Context objects, all 
ancestors of the frame objects 
up to the first objects with
which the user is familiar. 





APEX currently has a rather naive set of criteria for 
finding landmarks for an object. It inspects those objects 
that are physically close to the object, compared to the 
other objects that comprise the object’s parent. From these 
it selects objects that have relatively unusual physical 
properties or that are already familiar to the user. 
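
These criteria can be sketched as a filter over an object's siblings. The sketch is hypothetical throughout: objects are plain dictionaries, "close" is taken to mean within a fraction of the distance to the farthest sibling, and "unusual" is reduced to having a color that no other sibling shares.

```python
import math
from collections import Counter


def find_landmarks(target, siblings, familiar, near_fraction=0.5):
    """Pick nearby siblings that are unusual in color or already familiar.

    `target` and each sibling are dicts with "name", "pos" (x, y, z), and "color".
    """
    if not siblings:
        return []
    dist = {s["name"]: math.dist(target["pos"], s["pos"]) for s in siblings}
    # "Close" is relative to the other siblings: within a fraction of the farthest one.
    limit = near_fraction * max(dist.values())
    color_counts = Counter(s["color"] for s in siblings)
    landmarks = []
    for s in siblings:
        unusual = color_counts[s["color"]] == 1
        if dist[s["name"]] <= limit and (unusual or s["name"] in familiar):
            landmarks.append(s["name"])
    return landmarks


transmitter = {"name": "transmitter", "pos": (0, 0, 0), "color": "gray"}
others = [
    {"name": "receiver", "pos": (-2, 0, 0), "color": "gray"},
    {"name": "interface", "pos": (2, 0, 0), "color": "gray"},
    {"name": "speaker", "pos": (1, 1, 0), "color": "black"},
    {"name": "wall", "pos": (10, 0, 0), "color": "white"},
]
landmarks = find_landmarks(transmitter, others, familiar=set())
```

In this made-up layout only the speaker qualifies: it is nearby and its color is unique among the siblings, whereas the wall is unusual but too far away.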


Figure 5. Landmark 
objects, intended to 
help the viewer locate 
the frame objects. 





Figure 5 shows the landmarks that APEX selected for 
the transmitter, the drawer, and the handle. In this case a 
landmark is found for the transmitter alone: the speaker 
cabinet hanging on the wall behind it. Although all the 
objects included in the picture so far have been depicted in
their actual colors, the speaker is not. When the picture is
rendered, a subdued rendering style is chosen for objects
that were selected as landmarks, because they do not
directly participate in the action. The current implementation
realizes a subdued rendering style by blending the
object’s actual color with that of its parent, in this case the
background color.
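
Blending an object's color with that of its parent amounts to a linear interpolation. The sketch below assumes RGB triples and an equal blend; the text does not state the proportions APEX uses, so the `weight` parameter is hypothetical.

```python
def subdue(color, parent_color, weight=0.5):
    """Blend an RGB color toward its parent's color; weight=0 leaves it unchanged."""
    return tuple(round((1 - weight) * c + weight * p)
                 for c, p in zip(color, parent_color))


speaker = (200, 40, 40)
background = (0, 0, 128)            # a dark blue, chosen for illustration
subdued = subdue(speaker, background)
```

A larger `weight` pushes the landmark further toward the background, making it less prominent than the frame objects.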

For each object included so far, APEX searches in its
parents for those objects that are roughly similar to it, and
thus ones with which it could potentially be confused by
the reader. These similar objects are added to the picture in
the same manner as the landmark objects. Figure 6 shows
the objects found: the receiver and the interface, which are
roughly the same size, shape, and color as the transmitter,
and the two additional handles of the drawer. No similar
objects were found for the drawer or the speaker.


Figure 6. Similar objects, added to help disambiguate the
objects added so far from others that may be confused with
them.

Each similar object is then compared with the object for 
which it was selected, using the hierarchical comparison 
routine described previously. As the comparison proceeds, 
information is stored with each part of the object con- 
sidered, indicating whether the part was similar to or 
different from the other parts to which it was compared. 
This information is used later to determine how the object 
and its parts will be rendered, with disambiguating detail 
being included down to the level at which significant 
differences are found. Figure 7 shows this additional 
detail. The receiver’s children, three drawers and a panel,
are depicted, since at this level the major difference
between it and the transmitter (two additional drawers) is
discovered. The interface’s door and panel are shown for
the same reason. Lower level detail is not shown since it is
not needed for disambiguation nor does it participate in
the action being depicted.

APEX knows about another class of objects, those that
must be included in a picture to prevent it from looking
incorrect. For example, if an object is supported by
another that would be visible given the current viewing
specifications, then this supporting object is included. In
Figure 8 the floor has been added to the picture because it
supports the three cabinets.


Figure 7. Disambiguating detail, intended to help distinguish
the frame and landmark objects from those that
have been found to be similar to them.


Figure 8. Supporting objects, added if they are visible.


Figure 9. Siblings of those objects at the highest level at
which APEX added context.

Finally, any remaining objects in the highest level of the 
object hierarchy being depicted are added, with the viewing 
specifications being modified just enough to indicate their 
presence. Figure 9 shows the only remaining object in this 
case, the left wall. 

So far, the picture does not actually indicate the action 
to be performed. APEX currently knows how to indicate 
actions that involve motion by creating and including in 
the picture what we call a meta-object—an object that 
doesn’t actually exist in the world being depicted, but that 
will be used to refer to objects in that world. The meta- 
objects that APEX can create are arrows drawn in the 
direction of a translation or rotation. Some simple rules are
used to position the arrows based on the point at which
force is applied, coupled with the angle and center of
rotation or the distance and direction of translation. Figure
10 shows the arrow APEX has created to depict the
motion of the drawer out along the z-axis.


Figure 10. A meta-object arrow, added to indicate the
motion of the drawer.
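
For the translation case, positioning an arrow from the point of applied force, the direction, and the distance is a short vector computation. The sketch below is an assumption about how such a rule might look; the `Arrow` class and `translation_arrow` function are hypothetical, and arrows for rotation are not covered.

```python
from dataclasses import dataclass


@dataclass
class Arrow:
    tail: tuple
    head: tuple


def translation_arrow(force_point, direction, distance):
    """Arrow starting where force is applied and pointing along the motion."""
    length = sum(d * d for d in direction) ** 0.5
    if length == 0:
        raise ValueError("direction must be non-zero")
    unit = tuple(d / length for d in direction)
    head = tuple(p + distance * u for p, u in zip(force_point, unit))
    return Arrow(tail=tuple(force_point), head=head)


# Pull a handle at (1, 2, 0) three units out along the z-axis.
arrow = translation_arrow((1.0, 2.0, 0.0), (0, 0, 2), 3.0)
```

The direction is normalized first, so only `distance` controls the arrow's length.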


Depicting a sequence of actions 


APEX considers actions in the order in which they are 
performed. For each action that APEX knows how to 
depict (currently those that manipulate objects directly 
through translation or rotation), a picture will be created. 
The current implementation enforces a one-to-one mapping 
between depictable actions and pictures. 
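
The one-to-one mapping, combined with a user model that grows as objects are shown, can be sketched as a simple loop. Everything named here (`DEPICTABLE`, `depict_sequence`, the dictionary layout of actions and pictures) is hypothetical, and context is reduced to ancestor objects only.

```python
DEPICTABLE = ("translate", "rotate")


def depict_sequence(actions, parent_of, familiar):
    """One picture per depictable action; shown objects become familiar."""
    familiar = set(familiar)                   # do not mutate the caller's set
    pictures = []
    for action in actions:
        if action["kind"] not in DEPICTABLE:
            continue
        shown = [action["object"]]
        node = action["object"]
        # Add ancestors as context until something familiar is reached.
        while node not in familiar and parent_of.get(node) is not None:
            node = parent_of[node]
            shown.append(node)
        familiar.update(shown)                 # presumed increase in familiarity
        pictures.append({"action": action["kind"], "objects": shown})
    return pictures


parent_of = {"drawer": "transmitter", "transmitter": "room"}
actions = [{"kind": "translate", "object": "drawer"},
           {"kind": "plan", "object": "drawer"},
           {"kind": "rotate", "object": "drawer"}]
pictures = depict_sequence(actions, parent_of, familiar=set())
```

The first picture carries the full context; the second shows the drawer alone, because the user is presumed to know by then where it is.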

Figures 11-16 show the sequence of pictures APEX
created to show a series of actions, beginning in Figure 11
with the action depicted in Figure 3. Note that the pictures
are not supposed to be self-explanatory, but should ultimately
be displayed with accompanying text.
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Figure 11. The first picture in a sequence created by APEX 
to show a series of actions: Open the transmitter’s drawer. 
The rest of the sequence is shown in Figures 12-16. 





Figure 12. Rotate the drawer about its central support.

Figure 13. Open the drawer’s top panel.


Figure 12 is intended to show that the now open drawer 
is to be rotated about an internal support. The pictures are 
designed to be presented sequentially, and therefore attempt 
to take advantage of information presumed to be imparted 
by previous pictures. Here, since the user is now assumed 
to know which drawer is being rotated, the transmitter and 
its landmark and similar objects are no longer included to 
provide context. In fact, the transmitter cabinet and the 
supporting rails along which the drawer rides are shown 
only because of their role in supporting the drawer. Figure 
13 shows that a panel on the top side of the rotated drawer 
is to be opened. 

In Figure 14, which instructs the user to close the panel, 
one of APEX’s limitations is obvious. The arrow, although 
correctly drawn, is difficult to interpret because of its
position relative to the viewer; the current system does not
check for this. Figure 15 tells the user to rotate the drawer
back to its original position. Note that by showing only a
limited amount of the drawer’s detail, the picture indicates
more clearly that the motion is to be applied to the drawer,
not, for example, to a panel that might lie at the tail of the
arrow. Figure 16 tells the user to close the drawer.


Figure 14. Close the top panel.





Figure 15. Rotate the drawer back to its original position.

Figure 16. Close the drawer.

Implementation


The APEX testbed is written in Franz Lisp on a VAX
11/780 running Berkeley UNIX 4.2. The problem solvers
for which APEX creates pictures are written in micro-Nasl/Frail.
Pictures are generated in a 3D scene representation
format that is interpreted by a variety of scan
conversion systems. APEX required about 1 minute of
CPU time to generate the specifications for the pictures in
Figures 11-16. For debugging purposes the system allows
pictures to be scan-converted as their specifications are
generated, using a Lexidata Solidview z-buffered graphics
system. This device was used to make Figures 2-16 and
took approximately 10 seconds of real time per picture.


Research directions 


APEX is an actively evolving research tool. Its limita- 
tions are many and continually changing. A number of its 
components are temporary placeholders for better versions 
yet to be developed. Some of APEX’s limitations have 
been imposed for reasons of efficiency. For example, the 
current system operates only on cuboid objects. Other 
restrictions reflect deeper issues, some of which are dis- 
cussed below. 

Pictures and actions. The one-to-one mapping between 
depictable actions and their pictures is too limiting. It 
should be possible to make pictures of high-level actions 
that abstract their low-level actions into fewer pictures than 
the fully detailed low-level actions would normally require. 
If a series of actions must be performed in strict sequence, 
the system should ensure that this sequence is depicted 
adequately. For example, some sequences of motion per- 
formed on one object may be unambiguously concatenated 
to form a single arrow’s trajectory. Other sequences per- 
formed on one or more objects may be realized more 
clearly by coding the set of arrows in one picture as an 
ordered sequence, or by using separate pictures. 










The current system makes a single picture even if an 
action involving several small objects is surrounded by a 
much larger context. A better alternative would be a series 
of successive locator pictures that show with increasing 
specificity where to find the objects, or a detail inset that 
shows an enlarged and more detailed version of part of a 
picture. We would also like to create pictures that abstract 
or modify certain properties of objects besides detail. For 
example, interacting objects might be positioned differently 
in a picture so that their functional relationships rather 
than their actual physical locations are shown. 
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Detail removal and comparison. The approximations 
produced by detail removal may need to change when an
object changes, either by motion of its parts or by changes
in its hierarchical structure (for example, when a part is
removed during disassembly), or when a viewpoint changes.
Some performance improvements could be gained by
taking advantage of coherence between pictures. The
hierarchical nature of information associated with both
detail removal and object comparison can also be used to
limit how far the effects of a change have to propagate.

The methods currently used for detail removal and
comparison are far from satisfactory, even for static
objects. In particular, the detail removal procedure pays
attention only to the gross area covered in the projection. It
does not, for example, take into account silhouette information.
The comparison procedure does not deal well
with objects that have similar visual appearance but significantly
different hierarchical structure. Both detail removal
and comparison currently disregard the functional
importance of a part or its functional similarities to others.


Rule base. The implementation of the system’s picture-creation
process is still far too ad hoc: No satisfactory
mechanism is in place for easily developing or encoding
design rules, let alone allowing a nonprogrammer graphic
designer to specify them.


Conclusions 


APEX is a research testbed for exploring problems in
the automatic design of sequences of pictures that depict
the performance of actions on objects. Facts about actions,
participating objects, and user knowledge determine the
objects included in a picture, their level of detail, rendering
style, and the picture’s viewing specification. APEX’s
graphic design knowledge, however, is limited, static, and 
difficult to encode. It is a research vehicle, not a practical 
system. We believe that far more powerful systems will one 
day make possible interactive, automatically generated 
presentations that communicate effectively using pictures 
as well as text. 

Traditional Document Production
Techniques 


Richard J. Beach 
University of Waterloo 


Researchers make substantial use of books and journals in their everyday work. 
However, few people understand how those documents are produced. Only when they 
decide to write their own book or to edit a scholarly journal do they become involved in 
the mysterious world of the graphic arts. This survey is intended to help the reader to 
understand document production, to appreciate the many diverse roles and skills 
necessary, and to realize the vast number of details and decisions involved in 
producing high-quality documents. 





1. How do books get produced? 


An interesting review of how books are produced is contained in the anthology One 
Book/Five Ways [AAUP, One Book/Five Ways]. This reports on a comparative publishing 
experiment in which five university presses prepared the same book for publication: the 
University of Chicago Press, the MIT Press, the University of North Carolina Press, the
University of Texas Press, and the University of Toronto Press.


The procedures used in each press were remarkably similar. Although the approaches
varied somewhat, all involved the stages of acquisition, market and preliminary cost 
estimation, editorial revision, design, production, sales, and promotion. Each press
documented its procedures, its forms, and the guidelines it applied to the various
processes. One Book/Five Ways contains a rich collection of raw material for anyone
interested in the publishing process. 


In particular, the report includes the style guidelines from each of the presses. These 
guidelines establish the publisher's house style, and govern editorial, graphic design,
illustration, composition, and typesetting decisions. Perhaps the most well-known style
guideline for scholarly documents is The Chicago Manual of Style, which was referenced by
several presses in this experiment, although most have their own refinements and special
instructions. 

An important feature of the traditional book production process is the parallelism
achieved through several groups working on distinct aspects of a book. When a manuscript
arrives at the press for consideration, it is quickly copied and sent out for two or more
independent reviews to decide whether to publish the work. Once the decision to publish is 
made and the completed manuscript arrives from the author, copies are sent simultaneously 
to (1) the production editor, who establishes a job docket to track all of the subsequent stages 
of the publication, (2) the copy editor, who makes editorial revisions, and (3) the graphic 
designer, who designs the book and its illustrations. This parallelism is shown in Figure 1 for 
a simplified and hypothetical publication process. 


Other parts of the document publication process also involve parallelism. If the book is 
to have a jacket or cover illustration, that illustration is undertaken while the insides of the 
book are prepared. The table of contents and Library of Congress submission forms are 
prepared as soon as the book enters production to ensure that the imprint page and the front 
matter of the book are ready for printing. 


[Figure 1 diagram; legible labels: Author's Manuscript, Production Editor, Graphic Design, Illustration, Page Assembly, Printing]

Figure 1. TRADITIONAL GRAPHIC ARTS PROCESSES involve considerable parallelism 
in the procedures for publishing a manuscript. The author’s manuscript is copied and sent to 
the production editor, the copy editor, and the design/illustration department. Edited pages 
are typeset by the composition staff who are guided by the design of the document. The 
typeset manuscript and the illustrations are then assembled into pages in preparation for 
printing. 

The index is often on the critical path near the end of the document production cycle.
Since index entries must have the correct page numbers, the index cannot be fully completed
until all of the pages have been assembled. Typically the index entries are compiled in 
parallel with the book composition. After the page numbers are assigned on the reproduction 
pages (or page repros) the index manuscript is completed in parallel with the final 
proofreading of the book pages. 
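
The scheduling argument above can be illustrated with a small sketch. The task names loosely follow the text, but the durations (in days) and the exact dependency edges are invented for illustration; they are assumptions, not figures from the publishing experiment.

```python
from functools import lru_cache

# Hypothetical durations in days; only the general shape follows the text.
DURATION = {
    "copy_edit": 20,
    "design": 15,
    "composition": 25,
    "compile_index_entries": 10,
    "page_assembly": 10,
    "final_proofreading": 5,
    "complete_index": 7,
    "printing": 14,
}

# Each task lists the tasks that must finish before it can start (assumed edges).
DEPENDS_ON = {
    "copy_edit": [],
    "design": [],
    "composition": ["copy_edit", "design"],
    "compile_index_entries": ["copy_edit"],
    "page_assembly": ["composition"],
    "final_proofreading": ["page_assembly"],
    "complete_index": ["page_assembly", "compile_index_entries"],
    "printing": ["final_proofreading", "complete_index"],
}


@lru_cache(maxsize=None)
def earliest_finish(task):
    """Earliest day on which a task can finish, given its prerequisites."""
    start = max((earliest_finish(dep) for dep in DEPENDS_ON[task]), default=0)
    return start + DURATION[task]


def critical_path(task):
    """Walk backward from a task, always following the latest-finishing prerequisite."""
    path = [task]
    while DEPENDS_ON[path[-1]]:
        path.append(max(DEPENDS_ON[path[-1]], key=earliest_finish))
    return list(reversed(path))
```

With these made-up numbers, compiling the index entries runs in parallel with composition and costs nothing, but completing the index must wait for page assembly and therefore lands on the longest chain leading to printing.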


Even with the use of electronic composition tools, preparation of back matter is on the 
critical path and inconsistent page numbering occasionally results. Such problems appear in 
the appendices of the second edition of Newman and Sproull’s Principles of Interactive 
Computer Graphics [Newman&Sproull, Computer Graphics], in which the reference citations 
all refer to a preliminary draft version, because the authors forgot to make ‘one last revision 
pass’ over the reference citations in the appendices. The second edition was typeset by the 
authors using facilities at Xerox PARC because they could complete revisions up to the last 
minute and control the accuracy of computer programs contained in the text. In a normal 
production process, there are more people checking things and hence less chance of 
oversights, such as what actually happened in the appendices. 


An area of great concern to the publisher is administration of the production process. 
Publishers usually have several projects underway at the same time because of the delays 
involving revisions and approvals from the author of a single project. The production editor 
controls the document publication process for the publisher, determining time and cost 
estimates for the publication, selecting and contracting with suppliers, tracking the parallel
stages of the composition process, and keeping records of deadlines and expenses. In a 
journal publishing situation, the problem is compounded by the dual pressures of multiple 
authors and frequent publication deadlines for each issue. 


These process control functions are the most important contributions of publishers. 
Some publishing companies employ little more than production and marketing editors in 
house, subcontracting most of the skilled jobs such as copy editing, design, illustration,
composition, and printing. In the electronic publishing or self-publishing process, these
subcontracted jobs are performed by the manuscript author, and electronic document
production tools will have to handle them successfully.


The traditional document production process in the graphic arts routinely accommodates 
difficult manuscripts. Typically tables, mathematical notation, illustrations, and page layout
are aspects of document production that are considered difficult by traditional publishers. 
The following sections discuss how each one of these areas was handled in the comparative 
publishing experiment. 

Tables


There were only a small number of tables in the One Book/Five Ways experiment, but 
they were always treated separately from the main body of text. Many publishers rely on the 
skill of the compositor or typesetter to handle tables: 


“A good composing room can translate almost any tabular copy in a reasonably clear and 
presentable example of tabular composition.” [Williamson, Book Design, p 160] 


The Chicago Manual of Style provides authors with the “dos and don’ts” for preparing 
tables in manuscripts. In particular, authors are expected to prepare tables on separate pages 
because the tables will be composed separately from the text. There are some cautions also. 
For instance, the University of Chicago Press no longer prefers vertical rules in tables because 
Monotype composition (using molten metal casting of individual letters), which could insert a 
vertical rule easily, is no longer economical. With phototypesetter composition, vertical rules 
are difficult and expensive: 


“In line with a nearly universal trend among scholarly and commercial publishers, the 
University of Chicago Press has given up vertical rules as a standard feature of tables in 
the books and journals that it publishes. The handwork necessitated by including vertical 
rules is costly no matter what mode of composition is used, and in the Press's view the 
expense of it can no longer be justified by the additional refinement it brings.” [—, The 
Chicago Manual of Style, 1982, p 325-326] 


Mathematics 


Although there was no mathematics in this experiment, publishers treat mathematical
notation very differently than textual material. Kernighan and Cherry note this difficulty in
their paper on computer typesetting of mathematics [Kernighan&Cherry, eqn] where they
quote the following from The Chicago Manual of Style: 





“Mathematics is known in the trade as difficult, or penalty, copy because it is slower, more 
difficult and more expensive to set in type than any other kind of copy normally 
occurring in books and journals.” [—, A Manual of Style, 1969, p 295] 


Some publishers specialize in mathematical and scientific documents. They utilize both 
skilled copy editors and special suppliers to handle the difficult mathematical material. Other 
North American publishers send mathematics copy to the Far East, where hot metal 
composition provides the quality and cheap labor rates reduce the cost. 

Illustrations


The treatment of illustrations varied widely in the publishing experiment described in 
One Book/Five Ways. In one instance, a publisher chose to have an artist prepare line 
drawings rather than include halftone photographs because there were no convenient local 
suppliers to create halftone screens for the photographs. In contrast, another publisher 
planned photographs for the opening page of each chapter as well as for most of the 
illustrations. Generally, illustrations are prepared separately while the book is being
copy-edited, and are then manually assembled onto the completed pages.


Page Layout 


Examining the book design and page layout used by most publishers reveals mainly the 
results rather than the design process itself. Page dummies and sample pages are the usual 
products of the design process. Page dummies are sketches of the page layouts prepared by 
the graphic designer for approval. Sample pages are pages typeset and assembled by the 
composition supplier. Both techniques may require several iterations between designer, 
supplier, and publisher to make certain that the publisher is satisfied and that all the style 
guidelines are followed. Unfortunately, such an iterative design process generally means that 
the publisher's guidelines have never been completely specified, frustrating those attempting
to become a supplier with new technology. 


2. Roles involved in producing a book 


The document production process is complex. To help understand the process better,
this section examines the individual roles of people involved in producing a published
document. Anthropomorphism, or the attribution of human behavior to some problem, has
proven beneficial in making complex parallel processes more easily understood [Dyment,
Corkscrew] [Booth&Gentleman, Anthropomorphism]. An interactive paint program [Beach,
Paint] was implemented using multiple processes, where anthropomorphism served to clarify
and simplify the relationships of the parallel processes. Through cataloguing the roles
involved in document production, the structure of the problem becomes apparent as a set of
integrated processes. 

Figure 2. A HYPOTHETICAL PUBLISHING PROCESS indicating the roles and their
interactions at various stages. The horizontal axis represents elapsed time and the thin 
vertical lines join activities that begin or end at the same time. Delays or inactivity are not 
shown, but may exist at many places in the process. 


(Aside: An example of the lack of integration in electronic tools occurred when preparing 
Figure 2. There are 15 text labels and the first version of the illustration contained two 
spelling mistakes. Because the illustration was prepared with a separate illustration tool and 
was not integrated with the document, the spelling tool used on the text of this chapter was 
unable to find the mistakes in the illustration.) 


An important thing to remember while reading this categorization of roles is that the 
descriptions relate to activities and not people. Sometimes people may fill several roles at 
once, such as an author who types and composes the manuscript, or a graphic designer who
does the layout, illustration, and paste-up. The use of document composition tools in
universities and research labs has tended to encourage (or force) authors to take on multiple 
roles. From this experience, people may falsely conclude that each job looks easier than it is, 
especially when they are not aware of what they are doing wrong. Concentrating on each role 
separately helps us to understand the process and to realize the skills necessary to accomplish 
all aspects of that specialist's job. 

• Author of the manuscript


The author creates the original manuscript. Generally, the manuscript is textual material, 
although for some subject areas there will be vast quantities of mathematical notation, 
computer programs, tables, line drawings, or photographs. The author may produce several 
draft manuscripts with the assistance of a typist. Some authors now do their own typing with 
word processors or text editors. Sophisticated editorial tools, such as the diction and writing
style analysis tools offered in the UNIX Writer's Workbench [Cherry, Writing Tools]
[Macdonald, Writer's Workbench] and in other commercial editing systems [Alexander,
Editor Aids], may be used by an author to improve the quality of the writing. 


A draft manuscript is submitted by an author to an acquisition editor or journal editor for 
consideration. After a favorable publishing decision, the author completes the manuscript 
and adds front matter that may include a preface, an introduction, acknowledgements, etc. If 
the document is to be indexed or have other reference material, the author may need to 
prepare this material also. The completed manuscript is sent to the production editor, who 
begins the publication process. Some publishers will now accept manuscript submission in
electronic form, such as word processor diskettes or magnetic tape. 


The author may be involved in reviewing decisions made by the publisher. The copy 
editor will mark the manuscript with suggested changes and questions to be dealt with by the 
author. The graphic designer or illustrator may send drafts of the book design and illustration 
artwork for review and approval. There may also be an indexer involved, who may send the 
preliminary index entries to the author for review. The author must also check the 
composition process by first looking at the galleys and later at proofs of the assembled pages. 


• Typist


The typist prepares the draft manuscript for the author using a typewriter, word
processor, or text editor program. Typewriter composition involves only simple typography, 
typically with only a small number of type styles. Technical typing with many mathematical 
symbols is much more difficult and time consuming; some typists resort to hand printing 
symbols that are unavailable on the typewriter. The layout of typewritten material is free 
form and requirements are quite relaxed. Tables are easily laid out with fixed-width 
characters on a typewriter.

The human typist frequently acts as a built-in spelling checker and copy-editing service
while transcribing the manuscript. 


There are several drafts prepared during the creation of a manuscript. If each draft is
retyped to incorporate changes, there is a strong tendency to reduce the number of drafts 
because of the effort required. Often, the completed manuscript contains partial page inserts 
pasted or stapled together. 


• Acquisition Editor or Journal Editor


The acquisition editor solicits and reviews new manuscripts from authors. Opinions of
reviewers are sought to determine if the manuscript should be published. The publishing 
decision is made by a publication board or a committee of journal editors and is concluded by 
the signing of a publication contract or agreement between the publisher and the author. 


• Reviewer or Referee


A manuscript reviewer may be asked by a publisher to give one of several opinions. 
Book publishers refer to these people as reviewers, and journal editors refer to them as 
referees. Reviews made early in the process seek to establish the marketability of a 
manuscript or the appropriateness of a journal article. Later, more comprehensive reviews 
seek to assess the subject coverage, research contributions, and technical accuracy of the 
manuscript. Reviewers are generally most concerned with document content, although in 
some special cases they may also consider the format or style of a manuscript. 


Some reviewers of technical material may use their own typesetting capabilities to 
capture their comments in the complex notation of the subject area, such as mathematics or 
computer programming. In some cases, such as computer science journals, the reviews may 
even be transmitted electronically via electronic mail networks. 


• Production Editor


The production editor controls the document production process. Initially the 
production editor deals with the author to ensure that the manuscript has all necessary 
illustrations, that all the sections of the manuscript are finished, and that permission is 
obtained to reproduce items from other sources. Copies of the completed manuscript are sent 
in parallel to the copy editor for editorial revisions and to the graphic designer for book 
design and illustration. 

Production editors contact and select appropriate suppliers for graphic arts services when
those services are not available within the publisher. 


To help manage and track the various stages of several publications going on 
simultaneously, the production editor maintains a production database recording the 
expected services, the date and time each service began and finished, the estimated and actual 
costs incurred, and the current status of ongoing services. This database exists either on paper 
as the job docket (a large envelope that contains all the partially completed results) or in a 
computer file. 
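
As a rough illustration of the production database described above, the sketch below models a job docket as a list of service records. The class and field names (`ServiceRecord`, `JobDocket`, and so on) are hypothetical; the text specifies only what is recorded, not how.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ServiceRecord:
    """One expected service (copy editing, composition, ...) for a publication."""
    name: str
    estimated_cost: float
    actual_cost: Optional[float] = None
    began: Optional[datetime] = None
    finished: Optional[datetime] = None

    @property
    def status(self) -> str:
        # Status is derived from the recorded start and finish times.
        if self.began is None:
            return "pending"
        if self.finished is None:
            return "in progress"
        return "done"


@dataclass
class JobDocket:
    """All services tracked for a single publication."""
    title: str
    services: List[ServiceRecord] = field(default_factory=list)

    def ongoing(self) -> List[str]:
        return [s.name for s in self.services if s.status == "in progress"]

    def cost_overrun(self) -> float:
        # Only services with a known actual cost contribute to the overrun.
        return sum(s.actual_cost - s.estimated_cost
                   for s in self.services if s.actual_cost is not None)
```

A production editor handling several publications at once would keep one such docket per title and query each for ongoing services and accumulated overruns.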


• Graphic Designer


The graphic designer provides the book design and layout guidelines. This design can 
only be done effectively when the entire manuscript is available, although some designs are 
attempted with incomplete information and later revised during publication. The design 
guidelines are written in a specification sheet or in a style sheet to be sent to the compositor
with the copy-edited manuscript (see the example in the next section). 


As difficult typographic situations arise, graphic designers may design special guidelines 
for those not covered in the general scheme, such as designing the layout for tables, and
specifying the typography for nested lists of material or for foreign language extracts. 


Artwork for the illustrations may or may not be the responsibility of a graphic designer, 
depending on the designer's agreement, talents, or interests. Jacket or cover designs may also 
be the graphic designer's responsibility. 


• Copy Editor


The copy editor ensures that the manuscript meets the publisher's house style for 
language usage, grammar, spelling, citations, references, illustration captions, table
arrangements, headings, lists of items, foreign language phrases, etc. The copy editor deals 
with all the irksome details that would annoy the reader if they were not treated consistently. 
For example, the copy editor checks cross references from one section to another for 
completeness and verifies that captions, footnotes, and citations are numbered sequentially. 
Missing information or references and questionable corrections are sent to the author for 
action. 


Obviously electronic editing tools greatly assist the copy editor to accomplish these 
consistency checks. Displaying both the cross reference and its referent through multiple 
views (or windows) of a manuscript helps to check cross references; pattern-matching search
operations permit quick global checks; and style and diction analysis tools may be of assistance in
checking the grammar, spelling, and language usage.
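
A minimal sketch of the kind of pattern-matching consistency check mentioned above follows. It assumes captions start a line in the form "Figure N." and that references in running text read "Figure N"; both conventions and the function names are assumptions made for illustration.

```python
import re


def numbering_gaps(text, label="Figure"):
    """Return caption numbers that break the expected 1, 2, 3, ... sequence."""
    pattern = re.compile(r"^%s (\d+)\." % re.escape(label), re.MULTILINE)
    numbers = [int(m.group(1)) for m in pattern.finditer(text)]
    return [n for expected, n in enumerate(numbers, start=1) if n != expected]


def dangling_references(text, label="Figure"):
    """Return numbers that are cited in the text but have no matching caption."""
    captions = {int(n) for n in
                re.findall(r"^%s (\d+)\." % re.escape(label), text, re.MULTILINE)}
    cited = {int(n) for n in
             re.findall(r"\b%s (\d+)\b" % re.escape(label), text)}
    return sorted(cited - captions)
```

Passing `label="Table"` would run the same checks over table captions. Questionable results would still go to the author for action, as the text describes.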


The copy editor marks the manuscript for the compositor by identifying the logical parts 
of the document, such as chapter openings, various levels of section headings, types of lists of 
items, and captions for tables and illustrations. Selecting the typographic treatment of those 
logical parts is the responsibility of the graphic designer, who specifies to the compositor the 
typography for each part in the style guidelines. 


• Indexer


The indexer prepares the index entries for a manuscript, assigns page or reference 
numbers to each entry, sorts them, and creates an index manuscript. The indexing job may or 
may not be done by the author, although the author usually must approve the index 
manuscript. The indexer works with the manuscript in two stages: the copy-edited 
manuscript prior to composition to determine the index entries, and the page proofs to assign 
the correct page numbers to the sorted index entries. The requirement for correct page 
numbers places the index on the critical path for publication and some publications omit the 
index to reduce the delay. 
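
The two-stage indexing job can be sketched as follows. The sketch assumes each index entry is tied to an anchor in the copy-edited manuscript (the anchor identifiers are hypothetical) and that a map from anchors to page numbers becomes available only once the page proofs exist.

```python
from collections import defaultdict


def build_index(entries, page_of):
    """Combine stage-one entries with stage-two page numbers.

    entries: (term, anchor) pairs chosen from the copy-edited manuscript.
    page_of: anchor -> page number, known only after page assembly.
    """
    pages = defaultdict(set)
    for term, anchor in entries:
        pages[term].add(page_of[anchor])
    # Sort terms alphabetically, ignoring case, and pages numerically.
    return [(term, sorted(pages[term])) for term in sorted(pages, key=str.lower)]
```

The `entries` list can be compiled while composition proceeds, but `build_index` cannot run until `page_of` exists, which is what places the index on the critical path.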


Electronic aids for indexing have not proven to be a panacea. Winograd and Paxton 
created a general set of indexing tools [Winograd&Paxton, TEX Indexing], yet the index still 
required hand editing and fine tuning. The difficulty in preparing an index is the proper 
selection and cross referencing of index entry terms or phrases. Skilled indexers still produce 
better indices than most computer-generated ones because they index on meaning, not on a 
precise phrase found in the manuscript. 


• Illustrator, Draftsman, Graphic Artist


The illustrations for a publication are prepared from initial artwork provided by the 
author. The range of illustrations found in technical documents spans fine hand-drawn 
illustrations produced by a graphic artist, engineering drawings prepared by a draftsman, and 
photographs supplied by the author or a photographic service. Often illustrations are 
produced by tracing the author's sketches, which results in revision cycles as the author more 
clearly indicates his intentions. 

The graphic designer may produce illustration artwork personally or may establish
artwork guidelines for original size, reduction factors, line weight, typography, shading 
textures, materials, and so on. Reducing the original artwork improves the quality of the line 
drawings by making the line weights appear more consistent (small variations are less 
noticeable) and by sharpening the contrast in the image. Careful coordination of dimensions 
and text size on the original artwork is necessary to ensure that the reduced artwork suits the 
surrounding typography when assembled on the page. 


• Keyboarder, Coder, or Inputter


The composition of a document is accomplished in two stages: entering the marked-up 
manuscript into a typesettable file, and then outputting the file on a typesetting device. 
Typically there is one format code for each logical part of the document marked by the copy 
editor. For example, there might be a code for the chapter opening, for each level of section 
heading, for beginning an indented list of items, and for a line of a table. The job of entering 
the marked-up manuscript may be further subdivided into several phases: assigning format 
codes to the copy editor's marks, designing the typesetter codes for each format, and inputting 
the manuscript codes and text. The style sheet provided by the graphic designer determines 
the appearance of marked up parts of the manuscript and hence the typesetter codes required. 
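
The relationship between the copy editor's marks, format codes, and the style sheet might be sketched as below. The style sheet contents and the angle-bracket code syntax are invented for illustration and do not correspond to any particular typesetter.

```python
# Hypothetical style sheet: one entry per logical part marked by the copy editor.
STYLE_SHEET = {
    "chapter_opening": {"font": "Times Bold", "size_pt": 18, "leading_pt": 22},
    "section_heading": {"font": "Times Bold", "size_pt": 12, "leading_pt": 14},
    "body": {"font": "Times Roman", "size_pt": 10, "leading_pt": 12},
}


def typesetter_code(part):
    """Expand a format code into made-up typesetter commands."""
    style = STYLE_SHEET[part]
    return "<font %s><size %d><lead %d>" % (
        style["font"], style["size_pt"], style["leading_pt"])


def encode(marked_up):
    """Turn (format code, text) pairs into a typesettable file."""
    return "\n".join(typesetter_code(part) + text for part, text in marked_up)
```

Because the keyboarder enters only the format code, a change to the designer's style sheet alters the typesetter codes without any change to the keyed manuscript.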


The typesettable files may either be entered directly on less expensive slow typesetting
devices, or kept on some storage medium (perhaps paper tape, floppy diskettes, rigid disks, or
magnetic tape) for more expensive high-speed typesetters. Corrections to the typeset galley
proofs are most often made by typesetting corrected pieces of the manuscript, rather than
correcting the files and retypesetting the entire galley. In the case of large documents, 
management of the corrections is a concern and poses difficulties for subsequent uses of the 
document. 


• Compositor, Typesetter


The compositor produces the actual typeset output. This person may also do the 
keyboarding, but a compositor must have the skill to enter specific typographic codes for 
unusual or difficult typesetting jobs, such as for mathematics, tables, illustration labels,
copy-fit text that must fit certain dimensions, and so on. The compositor runs the typesettable file
through the typesetting device and produces the typeset galleys or pages. 

• Paste-up Artist


Most documents are typeset in galley form and later cut and pasted into page assemblies. 
The paste-up artist collects all the pieces of the manuscript in their final form: typeset text, 
running heads with page numbers, mechanical artwork for the illustrations, and photographs. 
Pages are assembled by cutting apart the galleys into pieces that will fit on each individual 
page and pasting the pieces onto page layout forms. These layout forms are typically printed 
with light blue lines that will not reproduce on photographic negatives for printing. The
paste-up process requires a sharp knife and a waxing machine, which coats the back of 
photopaper lightly with wax that helps the paper adhere to the layout forms when the two are 
pressed together. The wax adhesive is pliable so that the pieces can be safely separated if the 
layout needs to change. 


Paste-up only applies to photocomposition systems that produce paper or film original 
type. With metal foundry type, the assembly process involves moving metal type slugs into 
place and performing craft operations, like surrounding type slugs with furniture to provide 
the spacing for page layout, or kerning individual letter slugs by cutting off the corners to 
make them fit together better. Some legal organizations have required metal type for legal 
documents to avoid potential errors in electronic composition systems using phototypesetters 
[Leith, Metal type]; they wanted to see and verify the final type.


The graphic designer may paste up a document, especially if the manuscript requires
frequent design decisions. In such cases it is quite difficult afterward to determine the rules 
and logic that were applied to accomplish some of the creative layouts. 


• Process Camera Operator, Stripper


After the page assembly stage, completed pages are ready for printing. Depending on 
the printing process, it may be necessary to use a large-format graphic arts process camera to 
prepare photographic negatives of each page. The negatives are in turn used to expose 
printing plates. Text and line art illustrations are photographed directly on very high contrast 
negative film, whereas photographs are screened or halftoned to provide the tonal variations
on high contrast film. If the printer is capable of printing several pages in one pass, then the 
stripper must prepare an imposition of several pages into one printing signature. 


The graphic arts process of producing printing plates from assembled pages (master 
images) has been imitated by the concept of rendering device-independent image masters 
through page description languages like Interpress from Xerox [—, Interpress] and PostScript
from Adobe Systems [—, PostScript].

• Printer


The printing process selected by the publisher depends on the number of copies or 
impressions required. Short-run printing (up to 50 copies) can be printed cost-effectively 
with a photocopier from a paper original. Medium-run printing (from 50 to 1,000 copies) can 
be printed with an offset duplicator using an inexpensive paper-based printing plate.
Long-run printing (from 1,000 to 10,000 copies) is generally done on high-speed offset
printing presses in signatures containing several pages and using metal printing plates. 
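
The run-length rule of thumb above can be written as a small lookup. The ranges in the text share their endpoints (50 and 1,000 copies), so this sketch assumes each boundary belongs to the smaller category. Because the text gives no guidance above 10,000 copies, the function returns None there.

```python
def printing_process(copies):
    """Suggest a printing process for a run length, following the ranges in the text."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    if copies <= 50:        # short run; boundary handling is an assumption
        return "photocopier"
    if copies <= 1000:      # medium run
        return "offset duplicator"
    if copies <= 10000:     # long run
        return "high-speed offset press"
    return None             # beyond the ranges the text describes
```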


If the document requires color, then there must be separate impressions made for each 
printing ink color. Each impression requires a separate master image, one for each color of 
ink. To print images with a full range of colors, separations may be prepared by an outside 
supplier working from a slide transparency of the colored image. For a small number of flat 
colors (typically black plus one or two colors) the separations may be made by the process 
camera operator from color-keyed parts of the original document. 


• Binder


The printed pages must be collated and bound together to form a completed document. 
The bindery specializes in taking the bulk pages, possibly in signature form, folding them,
collating them in the correct sequence, sewing or otherwise fastening the pages together, and 
trimming the pages to finished size. The cover, whether a cloth-covered hard-cardboard case 
or a strong paper back, is attached around the document. Any printing on the cover or jacket 
must be designed and printed in time for binding. The result is a completed publication 
ready for distribution. 